The Applicants appeal the rejection of Claims 6-8 and 11-17 in the above-captioned
patent application. These claims were rejected in a final Office Action mailed June 22, 2005.
Applicants filed a Notice of Appeal on September 20, 2005, together with an Amendment after
final Office Action.

I. REAL PARTY IN INTEREST

Pursuant to 37 C.F.R. 41.37(c)(1), Appellants hereby notify the Board of Patent Appeals 
and Interferences that the real party in interest is the assignee of record for this application, 
Genentech, Inc., 1 DNA Way, South San Francisco, CA 94080. 

II. RELATED APPEALS AND INTERFERENCES

A Notice of Appeal has been filed in the related Application Nos. 10/063,519; 
10/063,534; 10/063,560; 10/063,586; 10/063,587; 10/063,592; 10/063,617; 10/063,661; and 
10/063,713. A Notice of Appeal and an Appeal Brief have also been filed in the related 
Application Nos. 10/063,661; 10/063,530; 10/063,540; 10/063,578; 10/063,584; 10/063,616;
10/063,648; 10/063,652; 10/063,653; 10/063,659; and 10/063,660. Appellants are unaware of 
any other related appeals or interferences. 

III. STATUS OF THE CLAIMS

The above-captioned application was filed with Claims 1-13. Applicants canceled 
Claims 1-3 and 9-10, amended Claims 4 and 13, and added new Claims 14-17 in an Amendment 
and Response to Office Action dated April 7, 2005. The Examiner finally rejected Claims 4-8 
and 11-17 in a final Office Action mailed June 22, 2005. Appellants filed Amendments to the
Claims with the Notice of Appeal on September 20, 2005, canceling Claims 4-5 without 
prejudice to, or disclaimer of, the subject matter contained therein, and amending Claim 12 to 
depend from Claim 6. Accordingly, Claims 6-8 and 11-17 are the subject of this appeal. The 
claims attached hereto as Appendix A reflect the claims as amended by the Amendment filed 
with the Notice of Appeal. 

IV. STATUS OF AMENDMENTS 

In the final Office Action mailed June 22, 2005, the Examiner indicated that the 
amendments filed on April 7, 2005 had been entered. Appellants filed an Amendment canceling 
Claims 4-5 and correcting the dependency of Claim 12 with the Notice of Appeal on September 
20, 2005. 

V. SUMMARY OF THE CLAIMED SUBJECT MATTER 

The claimed subject matter relates to isolated polypeptides related to the polypeptide 
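having SEQ ID NO: 136. Independent Claims 6 and 14 read:

6. An isolated polypeptide comprising:

(a) the amino acid sequence of the polypeptide of SEQ ID NO: 136;

(b) the amino acid sequence of the polypeptide of SEQ ID NO: 136,
lacking its associated signal peptide; or

(c) the amino acid sequence of the polypeptide encoded by the full-length
coding sequence of the cDNA deposited under ATCC accession number 203547.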



14. An isolated polypeptide having at least 95% amino acid sequence 
identity to: 
(a) the amino acid sequence of the polypeptide of SEQ ID NO: 136;

(b) the amino acid sequence of the polypeptide of SEQ ID NO: 136, 
lacking its associated signal peptide; 

(c) the amino acid sequence of the polypeptide encoded by the full-length 
coding sequence of the cDNA deposited under ATCC accession number 203547; 

and wherein said isolated polypeptide or a fragment thereof can be used to 
generate an antibody which can be used to specifically detect the polypeptide of 
SEQ ID NO: 136 in esophagus tissue samples. 

Various aspects of the claimed polypeptides are described in the specification at, for 
example, paragraphs [0001]-[0006], [0004]-[0026], [0161]-[0162], [0195]-[0208], [0221]-[0222], 
[0225], [0229], [0253]-[0284], [0336], [0361]-[0362], [0529]-[0530], Tables 6-8, and Figures 
135 and 136. SEQ ID NO: 136 is disclosed in the Sequence Listing appended to the application. 

VI. GROUNDS OF REJECTION TO BE REVIEWED ON APPEAL AND GROUPING 
OF CLAIMS 

A. Grounds of Rejection on Appeal 

The Examiner has rejected Claims 6-8 and 11-17 under 35 U.S.C. §101, stating that the 
claimed invention is not supported by either a specific and substantial asserted utility or a well- 
established utility. Office Action at 4. 

The Examiner has rejected Claims 6-8 and 12-17 under 35 U.S.C. §112, first paragraph 
as lacking an enabling disclosure for percent variants of SEQ ID NO: 136, asserting that "it
would require undue experimentation for one skilled in the art to make and use the claimed
genus of the molecules embraced by the instant claims." Id. at 14, 16.

The Examiner has also rejected Claims 6-8 and 12-17 under 35 U.S.C. § 112, first 
paragraph, as lacking an adequate written description, stating that "Applicants were not in 
possession of all or a significant number of polypeptides that have 95-99% homology to SEQ ID 
NO: 136 and that still retain the function of SEQ ID NO: 136." Id. at 16.

The Examiner has rejected Claims 6-8 and 11-17 under 35 U.S.C. § 102(b) as being
anticipated by Valenzuela et al. (WO 00/55375), published September, 2000. Id. at 18.

B. Grouping of Claims

1. Utility Rejection - Claims 6-8 and 11-17

For purposes of the utility rejection under 35 U.S.C. § 101, Claims 6-8 and 11-17 can be
considered as a group. 

2. Enablement Rejection - Claims 6-8 and 12-17 

a. Group 1 - Claims 6-8, and 12-13 

For purposes of the enablement rejection under 35 U.S.C. § 112, first paragraph, Claims 
6-8 and 12-13 can be considered as a group. 

b. Group 2 - Claims 14 and 16-17

For purposes of the enablement rejection under 35 U.S.C. § 112, first paragraph, Claims
14 and 16-17 can be considered as a group.

c. Group 3 - Claim 15 

For purposes of the enablement rejection under 35 U.S.C. § 112, first paragraph, Claim
15 should be considered individually.

3. Written Description Rejection - Claims 6 and 12-17

a. Group 1 - Claims 6 and 12-13

For purposes of the written description rejection under 35 U.S.C. § 112, first paragraph,
Claims 6 and 12-13 can be considered as a group.

b. Group 2 - Claims 14 and 16-17

For purposes of the written description rejection under 35 U.S.C. § 112, first paragraph,
Claims 14 and 16-17 can be considered as a group.

c. Group 3 - Claim 15

For purposes of the written description rejection under 35 U.S.C. § 112, first paragraph,
Claim 15 should be considered individually.

4. Anticipation Rejection - Claims 6-8 and 11-17

a. Group 1

For purposes of the anticipation rejection under 35 U.S.C. § 102(b), Claims 6-8 and
11-17 can be considered as a group.

VII. APPELLANTS' ARGUMENT

A. Summary of the Arguments 

1. Utility Rejection

The first issue before the Board is whether Appellants have asserted at least one "specific, 
substantial, and credible utility" for the claimed subject matter. See Examination Guidelines, 66 
Fed. Reg. 1092 (2001). Appellants have asserted that the claimed polypeptides related to the 
polypeptide of SEQ ID NO: 136 (the PRO1926 polypeptide) are useful as diagnostic tools for
cancer, particularly for esophageal cancer. This asserted utility is specific, substantial, and 
credible. 

Briefly stated, Appellants' asserted utility is based on the disclosure in Example 18 of the
instant application that the gene encoding the PRO1926 polypeptide is underexpressed by at least
a factor of two in the majority of esophageal tumors tested compared to normal esophageal tissue.
It is well-established that gene expression is correlated with expression of the encoded
polypeptide. Thus, one of skill in the art would be more likely than not to believe that the
underexpression of the PRO1926 gene in esophageal tumors compared to the normal tissue
counterpart leads to a decreased level of PRO1926 protein in these esophageal tumors compared
to normal esophageal tissue. This differential expression of PRO1926 mRNA and polypeptide is
useful for distinguishing esophageal tumor tissue from its normal tissue counterpart. Therefore,
the claimed polypeptides related to the PRO1926 polypeptide have a specific, substantial and
credible utility as diagnostic tools for cancer, particularly esophageal cancer, as is explained in
more detail below.

2. Enablement Rejection 

The second issue before the Board is whether Appellants have enabled the pending
claims such that one of skill in the art would be able to make and use the claimed invention. The
Examiner has rejected pending Claims 6-8 and 12-17 under 35 U.S.C. § 112, first paragraph,
arguing that the claimed subject matter was not described in the specification in such a way as to
enable one skilled in the art to make and/or use the invention. The Examiner argues that even if
the specification taught how to use the PRO1926 polypeptide, enablement would not be
commensurate in scope with claims which encompass percent variants and fragments of SEQ ID
NO: 136 because there is no structural or functional information provided in the specification.
Office Action at 14-15.

Appl. No.: 10/063,661
Filed: May 7, 2002

Appellants submit that Claims 6-8 and 12-17 are enabled such that one of skill in the art 
could make and use the claimed polypeptides without undue experimentation. With respect to 
Claims 6-8, how to make the polypeptide of SEQ ID NO: 136 and the polypeptide encoded by the 
cDNA deposited under ATCC accession number 203547 is within the skill in the art. Similarly, 
with respect to Claims 12-17, it is well within the skill of those in the art to make polypeptides 
that are at least 95% identical to SEQ ID NO: 136 and the polypeptide encoded by ATCC 203547, 
and it is well within the knowledge of those skilled in the art how to make antibodies which are 
specific to a disclosed sequence. See also In re Wands, 858 F.2d 731 (reversing the Board's 
decision of non-enablement and holding that as of 1980, undue experimentation was not required 
to make high-affinity monoclonal antibodies to a target peptide). Thus, one of skill in the art 
would be able to make the claimed polypeptides without undue experimentation. 

Appellants assert that the claimed polypeptides are useful as diagnostic tools for cancer, 
particularly esophageal cancer. This use is based on the disclosure in Example 18 of the instant
application that the nucleic acid encoding the PRO1926 polypeptide is underexpressed at least
two-fold in esophageal tumor compared to normal esophageal tissue. As detailed below, it is
well-established that changes in the expression level of mRNA are correlated with changes in the
expression level of the encoded polypeptide, and thus it is likely that the PRO1926 polypeptide is
underexpressed in esophageal tumors. Thus, based on the disclosure in the application, one of 
skill in the art would be able to use the claimed polypeptides as diagnostic tools to distinguish 
suspected esophageal tumors from normal esophageal tissue without undue experimentation. 

3. Written Description Rejection

The third issue before the Board is whether the claimed subject matter is described in the 
specification in such a way as to reasonably convey to one skilled in the art that the inventors had 
possession of the claimed invention at the time the application was filed. The Examiner has 
rejected pending Claims 6 and 12-17 under 35 U.S.C. §112, first paragraph, as lacking an 
adequate written description, stating that "even a very skilled artisan could not envision the 
detailed chemical structure of all or a significant number of encompassed PRO1926 polypeptides,
and therefore, would not know how to make or use them." Id. at 17.

The Examiner has the initial burden of rebutting the presumption that the written
description is adequate. He has failed to meet this initial burden because nowhere in either
Office Action does the Examiner address his arguments to Claim 6 or specifically to Claims
14-17. In addition, the generic arguments he has made are either flawed, or do not apply to Claims
6 and 12-17. For the reasons detailed below, Appellants submit that Claims 6 and 12-17 are 
adequately described such that one of skill in the art would recognize that the inventors had 
possession of the claimed invention at the time the application was filed. 

4. 35 U.S.C. § 102(b) Rejection

The fourth issue before the Board is whether pending Claims 6-8 and 11-17 are properly
rejected under 35 U.S.C. § 102(b) as being anticipated by Valenzuela et al. (WO 00/55375),
which was published September, 2000.

The instant application is a continuation of, and claims priority under 35 U.S.C. § 120 to,
US Application 10/006867 filed 12/6/2001, which is a continuation of, and claims priority under
35 U.S.C. § 120 to, PCT Application PCT/US00/23328 filed 8/24/2000, which claims priority
under 35 U.S.C. § 119 to US Provisional Application 60/170262 filed 12/9/1999. Appellants
submit that for the reasons detailed below, the claimed polypeptides have a credible, substantial,
and specific utility. The sequences of SEQ ID NOs: 135 and 136 and the data in Example 18
(Tumor Versus Normal Differential Tissue Expression Distribution) were both disclosed in PCT
Application PCT/US00/23328 filed 8/24/2000, and therefore the instant application is entitled to
a priority date of at least August 24, 2000. Valenzuela was published September, 2000. Thus,
Valenzuela was not published more than one year prior to the filing of Application
PCT/US00/23328 on August 24, 2000, and therefore Valenzuela cannot be cited as prior art
against the instant application under 35 U.S.C. § 102(b).
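
The date comparison in the preceding paragraph can be illustrated with a minimal sketch. It is offered only as an arithmetic check under stated assumptions, not as a statement of law: the function names are hypothetical, the one-year rule is reduced to a simple calendar comparison, and because the text gives only the month of the Valenzuela publication, the earliest possible day (September 1, 2000) is assumed.

```python
from datetime import date


def one_year_before(d: date) -> date:
    """Return the date one calendar year before d (Feb 29 maps to Feb 28)."""
    try:
        return d.replace(year=d.year - 1)
    except ValueError:
        # d is Feb 29 and the previous year is not a leap year
        return d.replace(year=d.year - 1, day=28)


def is_102b_bar(publication: date, effective_filing: date) -> bool:
    """True if the publication predates the effective filing date by more than one year."""
    return publication < one_year_before(effective_filing)


# Dates taken from the text; the day of publication is an assumption.
PRIORITY_DATE = date(2000, 8, 24)        # PCT/US00/23328 filing date
VALENZUELA_PUBLISHED = date(2000, 9, 1)  # "September, 2000", earliest day assumed

print(is_102b_bar(VALENZUELA_PUBLISHED, PRIORITY_DATE))  # prints False
```

Under these assumptions the reference falls after the asserted priority date, so the "more than one year prior" condition is not met for any day in September 2000.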

B. Utility Rejection - Detailed Arguments 

The first issue before the Board is whether Appellants have asserted at least one "specific, 
substantial, and credible utility." See Examination Guidelines, 66 Fed. Reg. 1092 (2001). 
Appellants have asserted that the claimed polypeptides related to the polypeptide of SEQ ID
NO: 136 (the PRO1926 polypeptide) are useful as diagnostic tools for cancer, particularly for
esophageal cancer. This asserted utility is specific, substantial, and credible.

1. Utility - Legal Standard

A "specific utility" is defined as utility which is "specific to the subject matter claimed,"
in contrast to "a general utility that would be applicable to the broad class of the invention." See
M.P.E.P. § 2107.01 I. For example, it is generally not enough to state that a nucleic acid is
useful as a diagnostic tool without also identifying the condition that is to be diagnosed. 

The requirement of "substantial utility" defines a "real world" use, and derives from the 
Supreme Court's holding in Brenner v. Manson, 383 U.S. 519, 534 (1966) stating that "[t]he 
basic quid pro quo contemplated by the Constitution and the Congress for granting a patent 
monopoly is the benefit derived by the public from an invention with substantial utility." In 
explaining the "substantial utility" standard, M.P.E.P. § 2107.01 cautions, however, that Office 
personnel must be careful not to interpret the phrase "immediate benefit to the public" or similar 
formulations used in certain court decisions to mean that products or services based on the 
claimed invention must be "currently available" to the public in order to satisfy the utility 
requirement. "Rather, any reasonable use that an applicant has identified for the invention that 
can be viewed as providing a public benefit should be accepted as sufficient, at least with regard 
to defining a 'substantial' utility." M.P.E.P. § 2107.01 (emphasis added). 

Indeed, the Guidelines for Examination of Applications for Compliance With the Utility
Requirement, set forth in M.P.E.P. § 2107 II(B)(1), give the following instruction to patent
examiners: "If the applicant has asserted that the claimed invention is useful for any particular 
practical purpose . . . and the assertion would be considered credible by a person of ordinary skill 
in the art, do not impose a rejection based on lack of utility." 

Finally, in assessing the credibility of the asserted utility, the M.P.E.P. states that "to 
overcome the presumption of truth that an assertion of utility by the applicant enjoys" the PTO 
must establish that it is "more likely than not that one of ordinary skill in the art would doubt (i.e., 
'question') the truth of the statement of utility." M.P.E.P. § 2107.02 III A. 

2. Utility - Burden of Proof 

It is well established that a specification which contains a disclosure of utility which 
corresponds in scope to the subject matter sought to be patented "must be taken as sufficient to 
satisfy the utility requirement of § 101 for the entire claimed subject matter unless there is reason 
for one skilled in the art to question the objective truth of the statement of utility or its scope." In
re Langer, 503 F.2d 1380, 1391, 183 U.S.P.Q. 288, 297 (C.C.P.A. 1974). Thus "the PTO has the
initial burden of challenging a presumptively correct assertion of utility in the disclosure." In re 
Brana, 51 F.3d 1560, 1566, 34 U.S.P.Q.2d 1436 (Fed. Cir. 1995). Only after the PTO provides 
evidence showing that one of ordinary skill in the art would reasonably doubt the asserted utility 
does the burden shift to the applicant to provide rebuttal evidence sufficient to convince such a 
person of the invention's asserted utility. Id. 

3. Utility - Standard of Proof 

Compliance with 35 U.S.C. § 101 is a question of fact. Raytheon v. Roper, 724 F.2d 951,
956, 220 U.S.P.Q. 592, 596 (Fed. Cir. 1983). The evidentiary standard to be used throughout ex
parte examination in setting forth a rejection is a preponderance of the evidence, or "more likely
than not" standard. In re Oetiker, 977 F.2d 1443, 1445, 24 U.S.P.Q.2d 1443, 1444 (Fed. Cir.
1992). This is stated explicitly in the M.P.E.P.:

[T]he applicant does not have to provide evidence sufficient to establish that an 
asserted utility is true "beyond a reasonable doubt." Nor must the applicant 
provide evidence such that it establishes an asserted utility as a matter of 
statistical certainty. Instead, evidence will be sufficient if, considered as a whole, 
it leads a person of ordinary skill in the art to conclude that the asserted utility is 
more likely than not true. M.P.E.P. § 2107.02, part VII (emphasis in original,
citations omitted). 

The Court of Appeals for the Federal Circuit has stated that the standard for satisfying the
utility requirement is a low one:

The threshold of utility is not high: An invention is "useful" under section 101 if
it is capable of providing some identifiable benefit. See Brenner v. Manson, 383
U.S. 519, 534, 86 S.Ct. 1033, 16 L.Ed.2d 69 (1966); Brooktree Corp v. Advanced
Micro Devices, Inc., 977 F.2d 1555, 1571 (Fed. Cir. 1992) ("To violate § 101 the
claimed device must be totally incapable of achieving a useful result"); Fuller v.
Berger, 120 F. 274, 275 (7th Cir. 1903) (test for utility is whether invention "is
incapable of serving any beneficial end"). Juicy Whip, Inc. v. Orange Bang, Inc.,
185 F.3d 1364, 1366, 51 U.S.P.Q.2d 1700 (Fed. Cir. 1999) (emphasis added).

The low threshold for satisfying the utility requirement is reflected in the standard set by the
Federal Circuit for invalidating a patent based on a lack of utility: "[T]he fact that an invention
has only limited utility and is only operable in certain applications is not grounds for finding lack
of utility. Some degree of utility is sufficient for patentability. Further, the defense of non-utility
cannot be sustained without proof of total incapacity." Envirotech Corp. v. Al George,
Inc., 730 F.2d 753, 762, 221 U.S.P.Q. 473 (Fed. Cir. 1984) (emphasis added, citations omitted).

Because the standard for satisfying the utility requirement is so low, requiring total
incapacity for a finding of no utility, the M.P.E.P. cautions that:

Rejections under 35 U.S.C. 101 have been rarely sustained by federal courts. 
Generally speaking, in these rare cases, the 35 U.S.C. 101 rejection was sustained 
[] because the applicant . . . asserted a utility that could only be true if it violated a 
scientific principle, such as the second law of thermodynamics, or a law of nature, 
or was wholly inconsistent with contemporary knowledge in the art. M.P.E.P. § 
2107.02 III B., citing In re Gazave, 379 F.2d 973, 978, 154 U.S.P.Q. 92, 96 
(C.C.P.A. 1967) (underline emphasis in original, italic emphasis added). 

4. Appellants Asserted a Specific, Substantial and Credible Utility that is
Sufficient to Satisfy the Utility Requirement of § 101

The claimed subject matter is directed to polypeptides comprising the amino acid 
sequence of the polypeptide of SEQ ID NO: 136, the amino acid sequence of the polypeptide of 
SEQ ID NO: 136 lacking its associated signal peptide, or the amino acid sequence of the 
polypeptide encoded by the full-length coding sequence of the cDNA deposited under ATCC 
accession number 203547. Additional claimed subject matter is directed to polypeptides having 
at least 95% amino acid sequence identity to the amino acid sequence of the polypeptide SEQ ID 
NO: 136, the amino acid sequence of the polypeptide of SEQ ID NO: 136 lacking its associated 
signal peptide, or the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 203547; wherein the isolated 
polypeptide or a fragment thereof can be used to generate an antibody which can be used to 
specifically detect the polypeptide of SEQ ID NO: 136 in esophageal tissue samples. The 
polypeptide of SEQ ID NO: 136 (referred to as "PRO1926 polypeptide") is encoded by the
polynucleotide of SEQ ID NO:135 (also referred to as DNA82340-2530). Specification at 
[0155-0156]. Appellants have asserted that the claimed polypeptides are useful as diagnostic 
tools for cancer, particularly esophageal cancer. 

In "Example 18: Tumor Versus Normal Differential Tissue Expression Distribution,"
Appellants disclose that the mRNA encoding PRO1926 polypeptide is more highly expressed in
normal esophageal tissue compared to esophageal tumor. Specification at [0529-0530] and
accompanying tables. As explained in paragraph [0530], the differential expression of the
PRO1926 mRNA was detected using the well-established technique of quantitative PCR
amplification of cDNA libraries isolated from different human normal and tumor tissue samples.
To ensure that equivalent amounts of nucleic acid were used in each reaction, the cDNA for
β-actin was used as a control.

The specification teaches that identification of the differential expression of a PRO 
polypeptide-encoding mRNA in one or more tumor tissues as compared to one or more normal 
tissues of the same tissue type "renders the molecule useful diagnostically for the determination 
of the presence or absence of tumor in a subject suspected of possessing a tumor." Specification
at ¶ [0530].

Appellants submit that because it is well established that changes in mRNA levels lead to
changes in the level of the encoded protein, one would expect the PRO1926 protein to be
underexpressed in esophageal tumors. The specification states that PRO polypeptides "may also
be used diagnostically for tissue typing, wherein the PRO polypeptides of the present invention
may be differentially expressed in one tissue as compared to another, preferably in a diseased
tissue as compared to a normal tissue of the same tissue type." Specification at ¶ [0336]. The
specification also discloses that PRO polypeptides and polypeptides related thereto can be used
to generate anti-PRO antibodies. Id. at ¶ [0364] and ¶ [0367]. The specification teaches that
such antibodies to PRO polypeptides can be useful as diagnostic tools:

[A]nti-PRO antibodies may be used in diagnostic assays for PRO [polypeptide], 
e.g., detecting its expression (and in some cases, differential expression) in 
specific cells, tissues, or serum. Various diagnostic assay techniques known in 
the art may be used, such as competitive binding assays, direct or indirect 
sandwich assays and immunoprecipitation assays conducted in either 
heterogeneous or homogeneous phases. Specification at [0407]. 

Taken together, the specification clearly discloses the use of the claimed polypeptides as 
diagnostic tools for cancer, particularly esophageal cancer. This utility is substantial, as one of 
skill in the art will recognize that the diagnosis of cancer is a "real world" use; it is specific, as 
the diagnosis of esophageal cancer is not a utility that applies to the broad class of antibodies; 
and it is credible, as it is not a utility "that could only be true if it violated a scientific principle,
...or a law of nature, or [is] wholly inconsistent with contemporary knowledge in the art." 
M.P.E.P. § 2107.02 III B., citing In re Gazave, 379 F.2d 973, 978, 154 U.S.P.Q. 92, 96 (C.C.P.A. 
1967). 

Because Appellants' specification contains a disclosure of utility which corresponds in
scope to the claimed subject matter, the asserted utility "must be taken as sufficient to satisfy the
utility requirement of § 101 for the entire claimed subject matter unless there is reason for one
skilled in the art to question the objective truth of the statement of utility or its scope." In re
Langer, 503 F.2d 1380, 1391, 183 U.S.P.Q. 288, 297 (C.C.P.A. 1974). Therefore, the burden of
establishing a prima facie case of lack of utility rests with the PTO. See In re Brana, 51 F.3d
1560, 1566, 34 U.S.P.Q.2d 1436 (Fed. Cir. 1995) ("the PTO has the initial burden of challenging
a presumptively correct assertion of utility in the disclosure").

5. The Examiner's Arguments

In the first Office Action, dated January 11, 2005, the Examiner rejected the pending 
claims, stating "Claims 1-13 are rejected under 35 U.S.C. 101 because the claimed invention is 
not supported by either a specific, substantial and credible asserted utility or a well established 
utility." First Office Action at 4. This rejection is maintained in the final Office Action mailed 
June 22, 2005. Final Office Action at 3-4. 

To establish a prima facie showing that the claimed subject matter lacks utility, the 
Examiner must "provide[] evidence showing that one of ordinary skill in the art would 
reasonably doubt the asserted utility." In re Brana, 51 F.3d 1560, 1566, 34 U.S.P.Q.2d 1436 
(Fed. Cir. 1995). The Examiner has issued a first Office Action and a final Office Action during 
the prosecution of the instant application. Neither of these papers provides any evidence that one 
of ordinary skill in the art would reasonably doubt the asserted utility. 

As an initial matter, Appellants note that during the course of prosecution, the Examiner 
has made a number of irrelevant arguments regarding an asserted lack of correlation between 
gene amplification and an increase in gene expression, as well as the role of aneuploidy in cancer, 
citing references by Sen et al. and Pennica et al. These arguments are irrelevant for the reasons
discussed in Appellants' Amendment and Response filed on April 7, 2005. As it appears that the
Examiner no longer relies on these arguments and references, they will not be addressed. See
Final Office Action at 4 ("The Office acknowledges that the microarray experiments disclosed in
the specification (example 18) does measure the level of mRNA expressed in tumor and normal
controls. Thus, the Office will not respond to Applicants arguments with respect to both Pennica
et al. and Sen et al. references.").¹ The Examiner's remaining arguments are summarized below.

The Examiner states that the specification discloses that the PRO1926 polynucleotide is
more highly expressed in normal esophageal tissue compared to esophageal tumor tissue 
counterparts, and that Applicants have asserted the use of the molecule for diagnosis. However, 
the Examiner rejects this utility, stating that "[t]here is no further supporting evidence to indicate 
that the polypeptide encoded by the polynucleotide of the instant invention is also differentially 
expressed in normal tissues compared to the tumor tissue and as such one of skill in the art would 
conclude that it is not supported by a substantial asserted utility or a well-established utility." 
Office Action at 4-5. The Examiner raises essentially two arguments to reject the asserted utility. 

First, the Examiner challenges the sufficiency of the data presented in Example 18. The
Examiner argues that the evidence of differential expression of the PRO1926 mRNA in
esophageal tumors is insufficient because it does not teach what the normal level of expression is,
it does not indicate how high the expression level is compared to the disease tissue, it lacks
statistical correlation, there is no data to compare expression in the normal and disease samples,
and that because the normal and tumor samples are not from the same person, there is no
possibility of direct comparison between the normal and tumor samples. See First Office Action
at 6-7. The Examiner also cites Hu et al. (J. Proteome Res., (2003) 2(4):405-12) to support his
assertion that the literature cautions researchers against drawing conclusions based on small
changes in transcript expression levels between normal and cancerous tissue. Final Office Action
at 6, 9-10, and 10.

Second, the Examiner argues that polypeptide levels cannot be accurately predicted from
mRNA levels because the correlation between mRNA levels and protein levels is poor at best,
citing Haynes et al. (Electrophoresis, (1998) 19(11):1862-71), Chen et al. (Mol. and Cell.
Proteomics, (2002) 1:304-313) and Gygi et al. (Mol. and Cell. Bio., (1999) 19(3):1720-30) for
support. Final Office Action at 6-7, 11, 12 and 13-14.

Based on these arguments, the Examiner concludes that "[f]urther research needs to be
done to determine whether the increase of PRO1926 cDNA in normal esophageal tissues
compared to esophageal tumor tissues supports a role for the polypeptide in the cancerous tissue;
such a role has not been suggested by the instant disclosure." Final Office Action at 5. The
Examiner states that this further research requirement makes clear that the asserted utility is not
substantial, and therefore the Appellants' invention is not complete. Id.

¹ For the record, Appellants note that the experiments reported in Example 18 of
the specification are not microarray experiments, but rather semi-quantitative PCR analysis of
cDNA libraries.

6. The Examiner has not established a Prima Facie case that Claims 6-8 and
11-17 lack Utility

The above arguments do not satisfy the Examiner's burden to "provide[] evidence
showing that one of ordinary skill in the art would reasonably doubt the asserted utility." In re
Brana, 51 F.3d 1560, 1566, 34 U.S.P.Q.2d 1436 (Fed. Cir. 1995). The Examiner has the burden
of presenting "countervailing facts and reasoning sufficient to establish that a person of ordinary
skill would not believe the applicant's assertion of utility." M.P.E.P. § 2107.02 III.A., citing In
re Brana, 51 F.3d 1560, 1566, 34 U.S.P.Q.2d 1436 (Fed. Cir. 1995) ("Only after the PTO
provides evidence showing that one of ordinary skill in the art would reasonably doubt the
asserted utility does the burden shift to the applicant to provide rebuttal evidence") (emphasis
added). The Examiner relies on the Hu et al., Haynes et al., Chen et al. and Gygi et al.
references to support his arguments. However, for the reasons discussed below, they do not
support the Examiner's position. Therefore, the Examiner's assertions are not supported by any 
facts, evidence, or reasoning, and there is simply no evidence in the record to support the 
Examiner's arguments that Appellants' asserted utility is not substantial, and the invention is 
incomplete. Absent some substantial evidence to support his assertions, the Examiner has failed 
to establish a prima facie showing that one of skill in the art would reasonably doubt the asserted 
utility, and the Board should accept Appellants' disclosed utility as sufficient. 

a. The data in Example 18 are sufficient to establish the asserted utility 

Appellants turn first to the Examiner's arguments challenging the reliability of the data 
reported in Example 18. The Examiner argues that Example 18 and the first declaration of Mr. 
Grimaldi are insufficient to overcome the utility rejection of the pending claims because the 
specification does not teach what the normal level of expression is, it does not indicate how high 
the expression level is compared to the disease tissue, it lacks statistical correlation, there is no 
data to compare expression in the normal and disease samples, and that because the normal and 
tumor samples are not from the same person, there is no possibility of direct comparison between 
the normal and tumor samples. See First Office Action at 6-7. The Examiner also argues that the 
literature cautions researchers from drawing conclusions based on small changes in transcript 
expression levels between normal and cancerous tissue, citing Hu et al. (J. Proteome Res., (2003) 
2(4):405-12). Final Office Action at 6, 9-10, 10 and 14. None of these unsupported arguments 
are sufficient to establish a prima facie case that one of skill in the art would reasonably doubt 
the asserted utility. 

Appellants note that the only objection by the Examiner to the data in Example 18 that is 
supported by any reasoning or evidence is the assertion based on the Hu et al. reference that the 
literature cautions researchers from drawing conclusions based on small changes in transcript 
expression levels between normal and cancerous tissue. The remainder of the objections are not 
supported by any evidence or reasoning as to why they make the data in Example 18 insufficient, 
and therefore they cannot establish a prima facie case. See In re Brana, 51 F.3d 1560, 1566, 34 
U.S.P.Q.2d 1436 (Fed. Cir. 1995) ("Only after the PTO provides evidence showing that one of 
ordinary skill in the art would reasonably doubt the asserted utility does the burden shift to the 
applicant to provide rebuttal evidence.") (emphasis added). Appellants address the Examiner's 
arguments below. 

The gene expression data in Example 18 of the specification show that the mRNA 
associated with protein PRO 1926 is more highly expressed in normal esophageal tissue 
compared to esophageal tumor. See Specification at ¶ [0530] and accompanying tables. Gene 
expression was analyzed using standard quantitative PCR amplification reactions of cDNA 
libraries isolated from different human tumor and normal human tissue samples. Id. It is well 
known in the art that the number of copies of a particular cDNA in the cDNA library is 
determined by the number of copies of the corresponding mRNA in the sample. Therefore, the 
cDNA libraries can be used to determine the level of expression of the corresponding mRNA in 
the tissue. 

Appellants have asserted that identification of the differential expression of the PRO 1926 
polypeptide-encoding gene in tumor tissue compared to the corresponding normal tissue renders 
the molecule useful as a diagnostic tool for the determination of the presence or absence of tumor. 
Id. In support of this asserted utility, Appellants submitted as Exhibit 2 to their Amendment and 
Response to Office Action, a first Declaration of J. Christopher Grimaldi, an expert in the field 
of cancer biology. This declaration explains the importance of the data in Example 18, and how 
differential gene and protein expression studies are used to differentiate between normal and 
tumor tissue. See First Grimaldi Declaration. 

In paragraphs 6 and 7, Mr. Grimaldi explains that the semi-quantitative analysis 
employed to generate the data of Example 18 is sufficient to determine if a gene is over- or 
under-expressed in tumor cells compared to corresponding normal tissue. He states that any 
visually detectable difference seen between two samples is indicative of at least a two-fold 
difference in cDNA between the tumor tissue and the counterpart normal tissue. He also states 
that the results of the gene expression studies indicate that the genes of interest "can be used to 
differentiate tumor from normal." He explains that, contrary to the PTO's assertions, "[t]he 
precise levels of gene expression are irrelevant; what matters is that there is a relative difference 
in expression between normal tissue and tumor tissue." First Grimaldi Declaration at ¶ 7. 

This declaration makes clear that since it is the relative level of expression between 
normal tissue and suspected cancerous tissue that is important, how high the level of expression 
in normal tissue is, is irrelevant. As to the Examiner's questions about the reliability and 
reproducibility of the results, Appellants employed standard techniques which are well-known 
and accepted by those of skill in the art. The Grimaldi Declaration states that if a difference is 
detected using these techniques, "this indicates that the gene and its corresponding polypeptide 
and antibodies against the polypeptide are useful for diagnostic purposes..." Id. Thus, it is the 
uncontested opinion of an expert in the field that the results are reliable enough to indicate that 
the claimed polypeptides are useful as diagnostic tools. As to the Examiner's concerns regarding 
the number and types of samples used, the Grimaldi Declaration states that the samples are 
pooled samples of normal and tumor tissue, and therefore are more reliable than individual 
samples. Id. at ¶ 5. 

The Examiner has also rejected the data because he questions the statistical significance 
of the data. However, Appellants are not required to prove utility to a statistical certainty, only 
that it is more likely than not true. See Nelson v. Bowler, 626 F.2d 853, 856-57, 206 U.S.P.Q. 
881, 883-84 (C.C.P.A. 1980) (reversing the Board and rejecting an argument that evidence of 
utility was insufficient because it was not statistically significant). Therefore, whether the results 
are statistically significant or not is irrelevant to establishing the asserted utility. 

The Examiner has also cited Hu et al. (J. Proteome Res., (2003) 2(4):405-12) in support 
of his assertion that the literature cautions researchers from drawing conclusions based on small 
changes in transcript expression levels between normal and cancerous tissue. The PTO states 
that Hu teaches that not all genes with increased expression in cancer have a known or published 
role in cancer. 

In Hu, the researchers used an automated literature-mining tool to summarize and 
estimate the relative strengths of all human gene-disease relationships published on Medline. 
They then generated a microarray expression dataset comparing breast cancer and normal breast 
tissue. Using their data-mining tool, they looked for a correlation between the strength of the 
literature association between the gene and breast cancer, and the magnitude of the difference in 
expression level. They report that for genes displaying a 5-fold change or less in tumors 
compared to normal, there was no evidence of a correlation between altered gene expression and 
a known role in the disease. See Hu at 411. However, among genes with a 10-fold or more 
change in expression level, there was a strong correlation between expression level and a 
published role in the disease. Id. at 412. Importantly, Hu reports that the observed correlation 
was only found among estrogen receptor-positive tumors, not ER-negative tumors. Id. 

The general findings of Hu are not surprising - one would expect that genes with the 
greatest change in expression in a disease would be the first targets of research, and therefore 
have the strongest known relationship to the disease as measured by the number of publications 
reporting a connection with the disease. The correlation reported in Hu only indicates that the 
greater the change in expression level, the more likely it is that there is a published or known role 
for the gene in the disease, as found by their automated literature-mining software. Thus, Hu's 
results merely reflect a bias in the literature toward studying the most prominent targets, and 
reflect nothing regarding the ability of a gene that is 2-fold or more differentially expressed in 
tumors to serve as a disease marker. 

Hu acknowledges the shortcomings of this method in explaining the disparity in Hu's 
findings for ER-negative versus ER-positive tumors: Hu attributes the "bias in the literature" 
toward the more prevalent ER-positive tumors as the explanation for the lack of any correlation 
between number of publications and gene expression levels in less-prevalent (and, therefore, less 
studied) ER-negative tumors. Id. Because of this intrinsic bias, Hu's methodology is unlikely to 
ever note a correlation of a disease with less differentially-expressed genes and their 
corresponding proteins, regardless of whether or not an actual relationship between the disease 
and less differentially-expressed genes exists. Accordingly, Hu's methodology yields results that 
provide little or no information regarding biological significance of genes with less than 5-fold 
expression change in disease. Nowhere in Hu does it say that a lack of correlation in their study 
means that genes with a less than five-fold change in level of expression in cancer cannot serve 
as a molecular marker of cancer. 

Appellants submit that a lack of known role for the PRO 1926 gene in cancer does not 
prevent its use as a diagnostic tool for cancer. There is a difference between use of a gene for 
distinguishing between tumor and normal tissue on the one hand, and establishing a role for the 
gene in cancer on the other. Genes with lower levels of change in expression may or may not be 
the most important genes in causing the disease, but the genes can still show a consistent and 
measurable change in expression. While such genes may or may not be good targets for further 
research, they can nonetheless be used as diagnostic tools. Thus, Hu does not refute the 
Appellants' assertion that the PRO 1926 gene can be used as a cancer diagnostic tool because it is 
differentially expressed in certain tumors. 

Contrary to the Examiner's assertion that one must know what role a gene or polypeptide 
plays in cancer for it to have utility, the PTO's own written policies recognize that the utility of a 
nucleic acid does not depend on the function of the encoded gene product. The Utility 
Examination Guidelines published on January 5, 2001 state: "In addition, the utility of a claimed 
DNA does not necessarily depend on the function of the encoded gene product. A claimed DNA 
may have a specific and substantial utility because, e.g. it hybridizes near a disease-associated 
gene or it has a gene regulating activity." (Federal Register, Volume 66, page 1095, Comment 
14). Similarly, here the disclosed nucleic acids, as well as the encoded polypeptides and related 
antibodies, are useful for determining whether an individual has cancer regardless of whether or 
not they are the cause of the cancer. 

The position of the Examiner requiring a known role for PRO 1926 in cancer for utility is 
also inconsistent with the analogous standard for therapeutic utility of a compound where "the 
mere identification of a pharmacological activity of a compound that is relevant to an asserted 
pharmacological use provides an 'immediate benefit to the public' and thus satisfies the utility 
requirement." M.P.E.P. § 2107.01 (emphasis original). Here, the mere identification of altered 
expression in tumors is relevant to diagnosis of tumors, and, therefore, provides an immediate 
benefit to the public. 

The data in Example 18 and the first Grimaldi Declaration are therefore sufficient to 
establish the asserted utility, and the Examiner has not rebutted the presumption of utility that the 
Appellants' application is afforded. Mr. Grimaldi is an expert in the field who conducted or 
supervised the experiments at issue. His declaration is based on personal knowledge of the 
relevant facts at issue. Appellants have reminded the Examiner that "Office personnel must 
accept an opinion from a qualified expert that is based upon relevant facts whose accuracy is not 
being questioned." M.P.E.P. § 2107 (emphasis added). In addition, declarations relating to 
issues of fact should not be summarily dismissed as "opinions" without an adequate explanation 
of how the declaration fails to rebut the Examiner's position. See In re Alton, 76 F.3d 1168 (Fed. 
Cir. 1996). The Examiner has offered no reason or evidence to reject either the underlying data 
or Mr. Grimaldi's conclusions. Therefore, the Examiner should accept Mr. Grimaldi's opinion 
with regard to his statement that "any visually detectable difference seen between two samples is 
indicative of at least a two-fold difference in cDNA between the tumor tissue and the counterpart 
normal tissue" and that the genes of interest "can be used to differentiate tumor from normal." 

In conclusion, Appellants submit that the evidence reported in Example 18, supported by 
the first Grimaldi Declaration, establishes that there is at least a two-fold difference in PRO 1926 
mRNA between esophageal tumor tissue and normal esophageal tissue. Therefore, it follows 
that the PRO 1926 gene, polypeptide, and antibody can be used to distinguish esophageal tumor 
tissue from its normal tissue counterpart. The Examiner has not offered any significant 
arguments or evidence to the contrary, and therefore has not established a prima facie case that 
one of skill in the art would reasonably doubt the asserted utility. 

b. The three references cited by the Examiner do not refute Appellants' 
assertion that a change in mRNA levels leads to a corresponding change 
in the level of the encoded protein 

Appellants turn next to the Examiner's second argument that polypeptide levels cannot be 
accurately predicted from mRNA levels because the correlation between mRNA levels and 
protein levels is poor at best. Final Office Action at 6-7, 11, 12 and 13-14. For support, the 
Examiner cites three references, Haynes et al. (Electrophoresis, (1998) 19(11):1862-71), Chen et 
al. (Mol. and Cell. Proteomics, (2002) 1:304-313) and Gygi et al. (Mol. and Cell. Bio., (1999) 
19(3):1720-30). Based on these references, the Examiner concludes that "it is clear that one 
skilled in the art would not assume that a more highly expressed mRNA would directly correlate 
with increased polypeptide levels." Final Office Action at 5. For the reasons discussed below, 
none of these references are contrary to Appellants' assertion that generally speaking, changes in 
mRNA levels lead to corresponding changes in the level of polypeptide. 

Haynes studied whether there is a correlation between the level of mRNA expression and 
the level of protein expression for 80 selected genes from yeast. The genes were selected 
because they constituted a relatively homogeneous group with respect to predicted half-life and 
expression level of the protein products. See Haynes at 1863. Haynes did not examine whether 
a change in transcript level for a particular gene led to a change in the level of expression of the 
corresponding protein. Instead, Haynes determined whether the steady-state transcript level 
correlated with the steady-state level of the corresponding protein based on an analysis of 80 
different genes. 

Haynes reported to have "found a general trend but no strong correlation between protein 
and transcript levels." The Examiner focuses on the portion of Haynes where the authors 
reported that for some of the studied genes with equivalent mRNA levels, there were differences 
in corresponding protein expression, including some that varied by more than 50-fold. Final 
Office Action at 7. Similarly, Haynes reports that different proteins with similar expression 
levels were maintained by transcript levels that varied by as much as 40-fold. Thus, Haynes 
showed that for one type of yeast, similar mRNA levels for different genes did not universally 
result in equivalent protein levels for the different gene products, and similar protein levels for 
different gene products did not universally result from equivalent mRNA levels for the different 
genes. These results are expected, since there are many factors that determine translation 
efficiency for a given transcript, or the half-life of the encoded protein. Not surprisingly, based 
on these results, Haynes concluded that protein levels cannot always be accurately predicted 
from the level of the corresponding mRNA transcript when looking at the level of transcripts 
across different genes. 

Importantly, Haynes did not say that for a single gene, changes in the level of mRNA 
transcript are not positively correlated with changes in the level of protein expression. 
Appellants have asserted that increasing or decreasing the level of mRNA for the same gene 
leads to an increase or decrease for the corresponding protein. Haynes did not study this issue and 
says absolutely nothing about it. One cannot look at the level of mRNA across several different 
genes to investigate whether a change in the level of mRNA for a particular gene leads to a change 
in the level of protein for that gene. Therefore, Haynes is not inconsistent with or contradictory 
to the utility of the instant claims, and offers no support for the Examiner's rejection of 
Appellants' asserted utility. 

The Examiner also relies on Gygi et al., a study on which the Haynes reference is based. 
Like Haynes, the Gygi reference looked at static levels of mRNA across different genes, not 
changes in mRNA levels for a single gene. Therefore, for the same reasons that Haynes is not 
relevant to Appellants' asserted utility, Gygi likewise offers no support for the Examiner's 
rejection of Appellants' asserted utility. 

Appellants turn next to the Chen et al. reference, where the authors examined the 
relationship between mRNA levels and protein levels in 76 lung adenocarcinomas and nine 
non-tumor lung samples. 

As an initial matter, it is important to note that a portion of Chen is not relevant to 
Appellants' assertion that changes in the level of mRNA lead to changes in the level of the 
encoded polypeptide. In one experiment similar to that of Haynes, Chen examined the global 
relationship between mRNA and the corresponding protein abundance by calculating the average 
mRNA and protein level of all the samples for each gene or protein, and then looked for a 
correlation across different genes. Based on these data, Chen reported that "no significant 
correlation between mRNA and protein expression was found (r = -0.025) if the average levels of 
mRNA or protein among all samples were applied across the 165 protein spots (98 genes)." 
Chen at Abstract. This measurement of a correlation across different genes is not relevant to 
Appellants' asserted utility for the same reasons discussed above with respect to the Haynes et al. 
and Gygi et al. references. 

Chen also looked at the level of mRNA of 98 individual genes and their corresponding 
proteins across the samples. Chen reports that 17% (28 of 165) of the protein spots, or 21.4% 
(21 of 98) of the genes, showed a statistically significant correlation between protein and mRNA 
expression. Chen at Abstract. It is these results that the Examiner relies on for support. 

However, read in its entirety, Chen provides scant evidence to counter Appellants' 
asserted utility because portions of Chen support Appellants' assertions, and the remaining 
portions provide little insight into the relationship between changes in mRNA levels and changes 
in the corresponding protein levels for mRNA that is differentially expressed in tumor cells 
relative to normal cells. 

Appellants have asserted that changes in mRNA levels, particularly those which are two- 
fold or greater, will correspond with measurable changes in polypeptide expression. The data in 
Chen support Appellants' assertion. In Figures 2A-2C, Chen plots mRNA value vs. protein 
value for three genes. In these figures, a wide range of mRNA expression levels were observed 
(approximately seven- to eight-fold), and a correlation between mRNA and protein levels was 
observed for all three mRNA/protein pairs. This supports Appellants' assertion that there is a 
correlation between changes in mRNA levels which are two-fold or greater and changes in 
polypeptide expression. 

The Examiner relies on the fact that Chen also reports a lack of correlation for some 
mRNA/protein pairs to support his assertion that polypeptide levels cannot be accurately 
predicted from mRNA levels. However, as is explained below, the apparent lack of a correlation 
cannot be used as evidence that Appellants' assertion of a general correlation is wrong. 

To determine if there is a correlation between changes in mRNA and changes in protein 
levels, one would have to conduct experiments where a measurable change in mRNA for a 
particular gene is observed, and then examine if there was a corresponding change in the level of 
the corresponding protein. Stated differently, if there is no substantial change in mRNA levels 
for a particular gene, one cannot measure a correlation between changes in mRNA and changes 
in the encoded protein for that gene. Therefore, one must know if the individual genes studied 
by Chen were differentially expressed to know if the observed lack of correlation has any 
relevance to Appellants' assertions of a general correlation between changes in mRNA and 
protein. 

Importantly, unlike Appellants, Chen did not examine differences in mRNA between 
tumor and normal tissue where one would expect to find substantial changes in the level of 
mRNA for certain genes. Instead, Chen merely selected proteins whose identity could be 
determined regardless of any changes in expression level. Chen at 306, right column. Therefore, 
it is not known if there was any substantial difference in mRNA levels for the various studied 
genes across samples - in short, with the exception of the genes in Figures 2A-2C, it is not 
known if the genes examined were differentially expressed. Also of significance for Appellants' 
asserted utility is the fact that Chen did not attempt to examine any differential expression 
between the cancerous lung samples and the non-cancerous lung samples - Chen did not 
distinguish between cancer and normal samples in their analysis. Since almost all samples tested 
by Chen were from the same type of tissue, one would expect most genes examined by Chen to 
have similar mRNA or protein levels across the samples. In the absence of substantial 
differential expression, no correlation would be observed. Because it is not known if there was a 
change in the level of the genes studied by Chen, i.e., whether they were differentially expressed, 
the lack of an observed correlation cannot be used to counter Appellants' assertion. 

In sum, the only data reported by Chen which show substantial changes in the 
expression of mRNA, Figures 2A-C, confirm Appellants' assertion that substantial changes in 
mRNA levels (e.g., 2-fold or greater) will correspond to substantial changes in polypeptide 
expression. Further, these data explain the lack of observed correlation between mRNA levels 
and protein levels for other genes reported by Chen - there is no indication the genes are 
differentially expressed. Thus, Chen's results do not refute Appellants' position. Instead, Chen 
supports Appellants' position that a significant correlation between changes in mRNA and 
protein levels exists for changes in mRNA levels that are 2-fold or greater. 

In further support of Appellants' position, Chen cites Celis et al. (FEBS Lett., 480:2-16 
(2000)) stating that the authors "found a good correlation between transcript and protein levels 
among 40 well resolved, abundant proteins using a proteomic and microarray study of bladder 
cancer." Chen at 311, first column (emphasis added). As mentioned above, the lack of a 
correlation across genes is not relevant to Appellants' asserted utility, and therefore Chen's 
discussion of this issue and citation of Anderson and Seilhamer (Electrophoresis, 18:533-37 
(1997)) and Gygi et al. (Mol. Cell. Bio., 19:1720-30 (1999)) offer no support for the Examiner's 
position. 

Given the fact that portions of Chen as well as the relevant references cited by Chen 
support Appellants' position, and the remainder of Chen cannot be relied on as contrary to the 
Appellants' position, the Examiner has failed to establish a prima facie case that one of skill in 
the art would doubt Appellants' asserted utility based on any lack of correlation between changes 
in mRNA level and changes in the corresponding protein level. 

c. Conclusion - The Examiner has failed to establish a prima facie case 
that one of skill in the art would doubt Appellants' asserted utility 

The Examiner has relied on essentially two arguments in rejecting the pending claims for 
lack of utility. First, the Examiner has previously questioned the sufficiency, reliability and 
significance of the data reported in Example 18 as well as the supporting first Grimaldi 
declaration. The Examiner has argued that absent some known translocation or mutation of 
PRO 1926, or some role for PRO 1926 in cancer formation or development, the disclosure is 
insufficient. Second, the Examiner relies on Haynes et al., Gygi et al. and Chen et al. to support 
the assertion that polypeptide levels cannot be accurately predicted from mRNA levels. 
Appellants have responded to each of these arguments in turn. 

First, Appellants have shown that the data in Example 18 are sufficient to show that 
PRO 1926 is useful as a cancer diagnostic tool. This assertion is supported by the first Grimaldi 
declaration. The Examiner has not provided any substantial reason or evidence for one of skill in 
the art to doubt the reliability or usefulness of Example 18, or the facts and conclusions in the 
first Grimaldi declaration, and has accepted the data reported in Example 18 in a related 
application. 

Second, Appellants have shown the Haynes and Gygi references are simply not relevant 
to the issue of whether a change in mRNA levels leads to a corresponding change in the level of 
the encoded protein. Appellants have also shown that portions of Chen et al., as well as some of 
the references cited by Chen, actually support Appellants' assertion that changes in mRNA levels 
generally correlate with changes in the level of the encoded polypeptide. The remainder of Chen 
is not reliable enough to offer any support for the Examiner's position. 

Taken together, the Examiner's arguments are not sufficient to satisfy the Examiner's 
burden to "provide[] evidence showing that one of ordinary skill in the art would reasonably 
doubt the asserted utility." In re Brana, 51 F.3d 1560, 1566, 34 U.S.P.Q.2d 1436 (Fed. Cir. 
1995). The Examiner's arguments are largely conclusory statements which are not supported by 
any substantial evidence or reasoning which explains why one of ordinary skill in the art would 
reasonably doubt the asserted utility. Therefore, the Board should accept the Appellants' 
disclosure of utility. See Ex parte Rubin, 5 U.S.P.Q. 2d 1461 (Bd. Pat. App. & Interf. 1987) 
("There is no factual support in this record for the examiner's questioning of the denaturation test 
reported in the specification. ... No reason to doubt 'the objective truth' of the asserted utility 
having been advanced by the examiner, we accept appellant's disclosure of utility corresponding 
in scope to the claimed subject matter."). 

7. Appellants have provided Sufficient Rebuttal Evidence of Utility 

"Only after the PTO provides evidence showing that one of ordinary skill in the art would 
reasonably doubt the asserted utility does the burden shift to the applicant to provide rebuttal 
evidence." In re Brana, 51 F.3d 1560, 1566, 34 U.S.P.Q.2d 1436 (Fed. Cir. 1995). The rebuttal 
evidence must be sufficient such that when it is considered as a whole, it is more likely than not 
that the asserted utility is true. See In re Oetiker, 977 F.2d 1443, 1445, 24 U.S.P.Q.2d 1443, 
1444 (Fed. Cir. 1992) (stating that the evidentiary standard to be used throughout ex parte 
examination in setting forth a rejection is a preponderance of the evidence, or "more likely than 
not" standard). The M.P.E.P. summarizes the standard of proof required: 

[T]he applicant does not have to provide evidence sufficient to establish that an 
asserted utility is true "beyond a reasonable doubt." Nor must the applicant 
provide evidence such that it establishes an asserted utility as a matter of 
statistical certainty. Instead, evidence will be sufficient if, considered as a whole, 
it leads a person of ordinary skill in the art to conclude that the asserted utility is 
more likely than not true. 

M.P.E.P. § 2107.02, part VII (emphasis in original, citations omitted). 

Appellants remind the Board that the Federal Circuit has stated that the standard for satisfying 
the utility requirement is a low one: "The threshold of utility is not high: An invention is 'useful' 
under section 101 if it is capable of providing some identifiable benefit." Juicy Whip, Inc. v. 
Orange Bang, Inc., 185 F.3d 1364, 1366, 51 U.S.P.Q. 2d 1700 (Fed. Cir. 1999). 

Even if the Examiner has satisfied his burden of presenting a prima facie case of lack of 
utility, Appellants have supplied more than enough rebuttal evidence, such that when considered 
as a whole, one of skill in the art would conclude that the asserted utility is more likely than not 
true. As discussed in detail below, Appellants have provided sufficient evidence that the gene 
encoding the PRO 1926 polypeptide is differentially expressed in esophageal tumors and can 
therefore be used as a diagnostic tool. In addition, Appellants have shown that it is well 
established in the art that there is a reasonable correlation between changes in mRNA level and 
changes in the corresponding protein level such that one of skill in the art would believe that the 
PRO 1926 polypeptide is also differentially expressed in certain cancers. Therefore, considering 
the evidence as a whole, one of skill in the art would believe that it is more likely than not that 
the claimed polypeptides are useful as diagnostic tools for cancer, particularly esophageal cancer. 

a. Appellants have established that the gene encoding the PRO 1926 
polypeptide is differentially expressed in certain cancers 

As discussed above, the Examiner has not provided any evidence or reasoning to 
challenge the reliability and significance of the data in Example 18 which reports that the mRNA 
for PRO 1926 is more highly expressed in normal esophageal tissue compared to esophageal 
tumor. In contrast to this complete lack of evidence on the part of the Examiner, Appellants 
have submitted the first Grimaldi declaration. That declaration establishes that it is the opinion 
of an expert in the field who has personal knowledge of the facts surrounding Example 18 that 
there is at least a two-fold difference in mRNA for PRO 1926 between the tumor tissue and the 
counterpart normal tissue, and that the PRO 1926 genes, polypeptides and antibodies are useful 
for differentiating tumor tissue from normal tissue. The Examiner has not provided any evidence 
or reasoning to challenge the facts and conclusions of the first Grimaldi declaration in support of 
Example 18. 

Given the disclosure of Example 18 and the supporting first Grimaldi declaration on the 
one hand, and the complete lack of any evidence on the other, it is clear that considering the 
evidence as a whole, one of skill in the art would conclude that it is more likely than not that the 
PRO 1926 gene is differentially expressed in esophageal tumor tissue compared to its normal 
tissue counterpart such that it is useful as a diagnostic tool to distinguish tumor tissue from 
normal tissue. 

As Appellants explain below, it is more likely than not that the PRO 1926 polypeptide is 
also differentially expressed in esophageal tumor tissue, and can therefore be used to distinguish 
tumor tissue from normal tissue. 

b. Appellants have established that generally there is a correlation between 
changes in mRNA expression levels and changes in the expression level 
of the encoded protein 
Appellants next turn to the second portion of their argument in support of their asserted 
utility - that it is well-established in the art that in most cases a change in the level of mRNA for 
a particular protein leads to a corresponding change in the level of the encoded protein. Given 
Appellants' evidence of differential expression of the mRNA for the PRO 1926 polypeptide in 
esophageal tumor, it is more likely than not that the PRO 1926 polypeptide is likewise 
differentially expressed, and therefore the claimed polypeptides are useful as diagnostic tools, 
particularly for esophageal tumor. 

In support of the assertion that changes in mRNA are positively correlated to changes in 
protein levels, Appellants submitted a second Declaration by J. Christopher Grimaldi, an expert 
in the field of cancer biology (originally submitted as Exhibit 5 with the Appellants' Amendment 
and Response to Office Action). As stated in paragraph 5 of the declaration, "Those who work 
in this field are well aware that in the vast majority of cases, when a gene is over-expressed ... the 
gene product or polypeptide will also be over-expressed.... This same principal applies to gene 
under-expression." Second Grimaldi Declaration at ¶ 5. Further, "increased mRNA expression 
is expected to result in increased polypeptide expression, and the detection of decreased mRNA 
expression is expected to result in decreased polypeptide expression." Id. 

Appellants also submitted the declaration of Paul Polakis, Ph.D., an expert in the field of 
cancer biology (attached as Exhibit 6 to Appellants' Amendment and Response to Office Action). 
As stated in paragraph 6 of his declaration: 

Based on my own experience accumulated in more than 20 years of research, 
including the data discussed in paragraphs 4 and 5 above [showing a positive 
correlation between mRNA levels and encoded protein levels in the vast majority 
of cases studied in relation to the present invention] and my knowledge of the 
relevant scientific literature, it is my considered scientific opinion that for human 
genes, an increased level of mRNA in a tumor cell relative to a normal cell 
typically correlates to a similar increase in abundance of the encoded protein in 
the tumor cell relative to the normal cell. In fact, it remains a central dogma in 
molecular biology that increased mRNA levels are predictive of corresponding 
increased levels of the encoded protein. Polakis Declaration at ¶ 6 (emphasis 
added). 

Dr. Polakis acknowledges that there are published cases where such a correlation does not exist, 
but states that it is his opinion, based on over 20 years of scientific research, that "such reports 
are exceptions to the commonly understood general rule that increased mRNA levels are 
predictive of corresponding increased levels of the encoded protein." Polakis Declaration at ¶ 6. 

The statements of Grimaldi and Polakis are supported by the teachings in Molecular 
Biology of the Cell, a leading textbook in the field (Bruce Alberts, et al. Molecular Biology of 
the Cell (3rd ed. 1994) (submitted with Appellants' Amendment and Response to Office Action 
as Exhibit 7, hereinafter "Cell 3rd") and (4th ed. 2002) (submitted with Appellants' Amendment 
and Response to Office Action as Exhibit 8, hereinafter "Cell 4th")). Figure 9-2 of Cell 3rd shows 
the steps at which eukaryotic gene expression can be controlled. The first step depicted is 
transcriptional control. Cell 3rd provides that "[f]or most genes transcriptional controls are 
paramount. This makes sense because, of all the possible control points illustrated in Figure 9-2, 
only transcriptional control ensures that no superfluous intermediates are synthesized." Cell 3rd 
at 403 (emphasis added). In addition, the text states that "Although controls on the initiation of 
gene transcription are the predominant form of regulation for most genes, other controls can act 
later in the pathway from RNA to protein to modulate the amount of gene product that is made." 
Cell 3rd at 453 (emphasis added). Thus, as established in Cell 3rd, the predominant mechanism 
for regulating the amount of protein produced is by regulating transcription. 

In Cell 4th, Figure 6-3 on page 302 illustrates the basic principle that there is a correlation 
between increased gene expression and increased protein expression. The accompanying text 
states that "a cell can change (or regulate) the expression of each of its genes according to the 
needs of the moment - most obviously by controlling the production of its mRNA." Cell 4th at 
302 (emphasis added). Similarly, Figure 6-90 on page 364 of Cell 4th illustrates the path from 
gene to protein. The accompanying text states that while potentially each step can be regulated 
by the cell, "the initiation of transcription is the most common point for a cell to regulate the 
expression of each of its genes." Cell 4th at 364 (emphasis added). This point is repeated on 
page 379, where the authors state that of all the possible points for regulating protein expression, 
"[f]or most genes transcriptional controls are paramount." Cell 4th at 379 (emphasis added). 

Further support for Appellants' position can be found in the textbook, Genes VI, 
(Benjamin Lewin, Genes VI (1997)) (submitted with Appellants' Amendment and Response to 
Office Action as Exhibit 9) which states "having acknowledged that control of gene expression 
can occur at multiple stages, and that production of RNA cannot inevitably be equated with 
production of protein, it is clear that the overwhelming majority of regulatory events occur at the 
initiation of transcription." Genes VI at 847-848 (emphasis added). 

Additional support is also found in Zhigang et al., World Journal of Surgical Oncology 
2:13, 2004 (submitted with Appellants' Amendment and Response to Office Action as Exhibit 
10). Zhigang studied the expression of prostate stem cell antigen (PSCA) protein and mRNA to 
validate it as a potential molecular target for diagnosis and treatment of human prostate cancer. 
The data showed "a high degree of correlation between PSCA protein and mRNA expression." 
Zhigang at 4. Of the samples tested, 81 out of 87 showed a high degree of correlation between 
mRNA expression and protein expression. The authors conclude that "it is demonstrated that 
PSCA protein and mRNA overexpressed in human prostate cancer, and that the increased protein 
level of PSCA was resulted from the upregulated transcription of its mRNA." Id. at 6. Even 
though the correlation between mRNA expression and protein expression occurred in 93% of the 
samples tested, not 100%, the authors state that "PSCA may be a promising molecular marker 
for the clinical prognosis of human PCa and a valuable target for diagnosis and therapy of this 
tumor." Id. at 7. 

Further, Meric et al., Molecular Cancer Therapeutics, vol. 1, 971-979 (2002) (submitted 
with Appellants' Amendment and Response to Office Action as Exhibit 11), states the following: 

The fundamental principle of molecular therapeutics in cancer is to exploit the 
differences in gene expression between cancer cells and normal cells... [M]ost 
efforts have concentrated on identifying differences in gene expression at the 
level of mRNA, which can be attributable to either DNA amplification or to 
differences in transcription. Meric et al. at 971 (emphasis added). 

Exploiting differences in gene expression between cancer cells and normal cells would not be a 
"fundamental principle" of molecular cancer therapeutics if there were no significant correlation 
between gene expression and protein levels. Stated another way, changes in mRNA without 
corresponding changes in protein levels would have little or no effect on cellular biology, and 
those of skill in the art would have no reason to examine the differences in gene expression at the 
mRNA level without such a correlation. However, as one of skill in the art recognizes, there is a 
strong correlation between changes in mRNA and changes in protein level. It is because of this 
strong correlation that it remains a "fundamental principle" of molecular therapeutics in cancer 
to look at changes in mRNA level. 

Together, the declarations of Grimaldi and Polakis, the accompanying references, and the 
excerpts and references discussed above all establish that the accepted understanding in the art is 
that there is a reasonable correlation between changes in gene expression and changes in the 
level of the encoded protein. In contrast to this substantial amount of evidence supporting 
Appellants' position, the Examiner has cited three references, Haynes et al., Gygi et al., and Chen 
et al. However, as discussed above, Haynes and Gygi are not relevant to the issue of whether a 
change in mRNA levels leads to a change in the level of the corresponding protein since they 
examined only static levels of mRNA across different genes. Likewise, portions of Chen and the 
relevant references cited by Chen actually support Appellants' position, and the remainder of 
Chen is inconclusive. When considered as a whole, the preponderance of the 
evidence clearly weighs in favor of Appellants. 

Appellants have presented sufficient evidence to establish that the mRNA for PRO 1926 
is differentially expressed in esophageal tumors compared to its normal tissue counterpart, and 
that it is more likely than not that this leads to differential expression of the PRO 1926 
polypeptide. This makes the claimed polypeptides related to the PRO 1926 polypeptide useful 
for diagnosing cancer, particularly esophageal tumors. Given the overwhelming amount of 
evidence in support of Appellants' position, and the near absence of any evidence in support of 
the Examiner's position, when considered as a whole the evidence leads a person of ordinary 
skill in the art to conclude that the asserted utility is more likely than not true. 

c. The asserted utility is specific 

Finally, Appellants address the PTO's assertion that the asserted utilities are not specific 
to the claimed polypeptides related to PRO 1926. 

Specific Utility is defined as utility which is "specific to the subject matter claimed," in 
contrast to "a general utility that would be applicable to the broad class of the invention." 
M.P.E.P. § 2107.01 I. Appellants submit that the evidence of differential expression of the 
PRO 1926 gene and polypeptide in esophageal tumor cells, along with the declarations and 
references discussed above, provide a specific utility for the claimed polypeptides. 

As discussed above, there are significant data which show that the gene for the PRO 1926 
polypeptide is expressed at least two-fold higher in normal esophageal tissue compared to 
esophageal tumor. These data are strong evidence that the PRO 1926 gene and polypeptide are 
associated with esophageal tumors. Thus, contrary to the assertions of the Examiner, Appellants 
have provided evidence associating the PRO 1926 gene and polypeptide with a specific disease. 
The asserted utility for polypeptides related to the PRO 1926 polypeptide as diagnostic tools for 
cancer, particularly esophageal tumor, is a specific utility - it is not a general utility that would 
apply to the broad class of polypeptides. 

8. The Examiner's Response to Appellants' Evidence is Insufficient to Rebut 
Appellants' Arguments 

The Examiner has stated that the Grimaldi and Polakis declarations are "insufficient to 
overcome the rejection of claims 4-8 and 11-17" based on 35 U.S.C. §§ 101 and 112. Final 
Office Action at 7. In addition, the Examiner has summarily dismissed Appellants' supporting 
references. Id. at 11-12. 

a. The Examiner's response to the First Grimaldi Declaration 

The Examiner's only response to the first Grimaldi Declaration is in response to the 
statement in ¶ 7 of the Declaration that differences in mRNA expression also render the protein 
useful as a diagnostic tool. The Examiner states that "there is no description in the specification 
[] that would indicate a correlation with higher or lower expression levels of the message to the 
PRO 1926 polypeptide." Final Office Action at 8. 

The Examiner's statement has nothing to do with the accuracy of the conclusions 
expressed in the first Grimaldi Declaration, and does not provide a basis for rejecting Mr. 
Grimaldi's conclusions. In addition, Appellants have provided numerous references and the 
declaration of an additional expert, which indicate that it is well established that generally, there 
is a correlation between changes in mRNA level and changes in the level of the corresponding 
protein. 

Appellants submit that the declaration of Mr. Grimaldi is based on personal knowledge of 
the relevant facts at issue. Mr. Grimaldi is an expert in the field and conducted or supervised the 
experiments at issue. Appellants have reminded the Examiner that "Office personnel must 
accept an opinion from a qualified expert that is based upon relevant facts whose accuracy is not 
being questioned." PTO Utility Examination Guidelines (2001) (emphasis added). In addition, 
declarations relating to issues of fact should not be summarily dismissed as "opinions" without 
an adequate explanation of how the declaration fails to rebut the Examiner's position. In re 
Alton, 76 F.3d 1168 (Fed. Cir. 1996). 

Mr. Grimaldi has personal knowledge of the relevant facts, has based his opinion on 
those facts, and the Examiner has offered no reason or evidence to reject either the underlying 
facts or his opinion. Therefore, the Examiner and Board should accept Mr. Grimaldi's opinion 
with regard to his statement that "any visually detectable difference seen between two samples is 
indicative of at least a two-fold difference in cDNA between the tumor tissue and the counterpart 
normal tissue" and that the nucleic acids of interest "can be used to differentiate tumor from 
normal." Together, these statements establish that there is at least a two-fold difference in 
expression, and that the results are reliable enough that they can be used to distinguish tumor 
from normal tissue. 

b. The Examiner's response to the Second Grimaldi Declaration 
In response to the second Grimaldi Declaration, the Examiner focuses on paragraph 4 of 
the declaration where it states that for chromosomal aberrations which result in aberrant 
expression of a mRNA and corresponding protein, "the gene product is a promising target for 
cancer therapy, for example, by the therapeutic antibody approach." Final Office Action at 8-9 
(emphasis added). The Examiner rejects this argument, stating that it was not persuasive because 
unlike the genes discussed in the references cited in the declaration, "[t]he PRO 1926 gene, ... has 
not been associated with tumor formation or the development of cancer, nor has it been shown to 
be predictive of such. Similarly, ... no translocation of PRO 1926 is known to occur. ... No 
mutation or translocation of PRO 1926 has been associated with for example, esophageal tumor." 
Final Office Action at 9 (emphasis in original). The Examiner concluded that "in the absence of 
any of the above information" the disclosure was insufficient to satisfy the requirements of § 101. 
Id. 

The Examiner's arguments fail to establish that one of skill in the art would doubt 
Appellants' asserted utility. Once again, the Examiner has failed to establish how the "absence 
of any of the above information" is relevant to the asserted utility by supplying evidence or 
reasoning to support his assertion. See In re Brana, 51 F.3d 1560, 1566, 34 U.S.P.Q.2d 1436 
(Fed. Cir. 1995) ("Only after the PTO provides evidence showing that one of ordinary skill in the 
art would reasonably doubt the asserted utility does the burden shift to the applicant to provide 
rebuttal evidence.") (emphasis added). 

The lack of a known role for PRO 1926 in tumor formation or the development of cancer 
does not prevent its use as a diagnostic tool for cancer. Likewise, the fact that there is no known 
translocation or mutation of PRO 1926 is irrelevant to whether its differential expression can be 
used to assist in diagnosis of cancer - one does not need to know why PRO 1926 is differentially 
expressed, or what the consequence of the differential expression is, in order to exploit the 
differential expression to distinguish tumor from normal tissue. 

The Revised Interim Utility Guidelines promulgated by the PTO recognize that proteins 
which are differentially expressed in cancer have utility. In the caveat to Example 12, the 
Guidelines state that the utility requirement is satisfied for a protein that is expressed on 
melanoma cells but not on normal skin, and that antibodies against the protein can be used to 
diagnose cancer. The specification in Example 12 teaches nothing about the role of the 
hypothetical protein in cancer formation or development. In addition, while Appellants 
appreciate that actions taken in other applications are not binding on the PTO with respect to the 
present application, Appellants note that the PTO has issued several patents claiming 
differentially expressed polypeptides. See, e.g., U.S. Patent No. 6,414,117, and U.S. Patent No. 
6,124,433 (submitted as Exhibits 3 and 4 to Appellants' Amendment and Response to Office 
Action). 

In addition, Appellants note that they did not even rely on the portion of the second 
Grimaldi declaration cited by the Examiner which discusses targets for cancer therapy. Instead, 
Appellants submitted the second Grimaldi declaration in support of the assertion that changes in 
mRNA are positively correlated to changes in protein levels. Appellants relied on paragraph 5 of 
the declaration which states: "Those who work in this field are well aware that in the vast 
majority of cases, when a gene is over-expressed... the gene product or polypeptide will also be 
over-expressed.... This same principal applies to gene under-expression." Amendment and 
Response to Office Action at 19, quoting Second Grimaldi Declaration at ¶ 5. As support for this 
statement, Mr. Grimaldi noted that "[t]echniques used to detect mRNA, such as Northern 
Blotting, Differential Display, in situ hybridization, quantitative PCR, Taqman, and more 
recently Microarray technology all rely on the dogma that a change in mRNA will represent a 
similar change in protein. If this dogma did not hold true then these techniques would have little 
value and not be so widely used." Second Grimaldi Declaration at ¶ 5. Whether the differential 
expression of mRNA is due to mutations or translocations has no bearing on the portion of the 
Grimaldi reference relied on by Appellants. 

In conclusion, the Examiner has not provided any evidence or reasoning to reject the 
second Grimaldi Declaration, particularly ¶ 5 on which Appellants rely. Appellants reiterate that 
"Office personnel must accept an opinion from a qualified expert that is based upon relevant 
facts whose accuracy is not being questioned." PTO Utility Examination Guidelines (2001) 
(emphasis added). 

c. The Examiner's response to the Polakis Declaration 
In response to the Polakis Declaration, the Examiner makes two arguments. First, the 
Examiner states: 



The specification describes only mRNA expression data. The argument presented 
evinces that instant specification provides a mere invitation to experiment, and not 
readily available utility. Furthermore, as indicated above the literature cautions 
researchers against drawing conclusions based on small changes in transcript 
expression levels between normal and cancerous tissue (see Hu et al discussions 
above). It is also not known whether PRO 1926 polypeptide is expressed in normal 
skin tissue. There is no nexus between the mRNA expression and PRO 1926 
polypeptide. In the absence of any of the above information, all that the 
specification does is present evidence that the mRNA encoding PRO 1926 is 
present at higher levels in normal skin tissues compared to melanoma tumor 
tissues counterparts, and invite the artisan to determine the rest of the story. This 
is further borne out by Grimaldi assertion that "additional studies can then be 
conducted if further information is desired" (Appendix A, paragraph 7). Such is 
insufficient to meet the requirements of 35 U.S.C. § 101 utility for the claimed 
protein. Final Office Action at 9-10. 

This argument is not responsive to the Polakis Declaration, particularly the passage in 
paragraph 6 which states that changes in mRNA level lead to changes in protein level: 

Based on my own experience accumulated in more than 20 years of research, 
including the data discussed in paragraphs 4 and 5 above and my knowledge of 
the relevant scientific literature, it is my considered scientific opinion that for 
human genes, an increased level of mRNA in a tumor cell relative to a normal cell 
typically correlates to a similar increase in abundance of the encoded protein in 
the tumor cell relative to the normal cell. In fact, it remains a central dogma in 
molecular biology that increased mRNA levels are predictive of corresponding 
increased levels of the encoded protein. Polakis Declaration at ¶ 6 (emphasis 
added). 

Paragraphs 4 and 5 of the Polakis Declaration disclose that in the course of his research 
which is closely related to the instant invention, Dr. Polakis has identified approximately 200 
gene transcripts that are differentially expressed in human tumors. He has generated antibodies 
to about 30 of the protein products. In paragraph 5, he states that: 

[T]here is a strong correlation between changes in the level of mRNA present in 
any particular cell type and the level of protein expressed from that mRNA in that 
cell type. In approximately 80% of our observations we have found that increases 
in the level of a particular mRNA correlates with changes in the level of protein 
expressed from that mRNA when human tumor cells are compared with their 
corresponding normal cells. Polakis Declaration at ¶ 5. 

Clearly, paragraphs 4 and 5 provide significant evidentiary support for his conclusions in 
paragraph 6. As to his statement that it is a central dogma of molecular biology that increases in 
mRNA lead to increases in protein, this statement is also supported by the data in paragraphs 4 
and 5, as well as Dr. Polakis' expertise and more than 20 years of research in the field. 

Appellants remind the Board that "Office personnel must accept an opinion from a 
qualified expert that is based upon relevant facts whose accuracy is not being questioned." PTO 
Utility Examination Guidelines (2001) (emphasis added). In addition, declarations relating to 
issues of fact should not be summarily dismissed as "opinions" without an adequate explanation 
of how the declaration fails to rebut the Examiner's position. In re Alton, 76 F.3d 1168 (Fed. Cir. 
1996). As the Examiner has not provided any reason or evidence to challenge the factual basis 
for Dr. Polakis' opinion, it must be accepted as true. 

As Appellants have already addressed the Hu reference above, Appellants next address 
the Examiner's statement that "[t]here is no nexus between the mRNA expression and PRO 1926 
polypeptide." Final Office Action at 10. 

There is an obvious nexus between PRO 1926 mRNA expression and the PRO 1926 
polypeptide. As described in numerous references and declarations above, regulation of mRNA 
is the primary method for controlling the expression of a gene, and there is a general correlation 
between changes in mRNA levels and changes in protein levels. Because PRO 1926 mRNA 
encodes the PRO 1926 polypeptide, changes in PRO 1926 mRNA levels lead to changes in 
PRO 1926 polypeptide levels - the nexus between mRNA and the encoded protein is well- 
established. 

Appellants also address the Examiner's statement that further research is required for the 
invention, and that this requirement "is borne out by Grimaldi assertion that 'additional studies 
can then be conducted if further information is desired.'" Final Office Action at 10. 

The Examiner's reliance on the quote from the first Grimaldi declaration is clearly 
misplaced when read in context: 

7. The results of the gene expression studies indicate that the genes of 
interest can be used to differentiate tumor from normal. The precise levels of 
gene expression are irrelevant; what matters is that there is a relative difference in 
expression between normal tissue and tumor tissue. ... If a difference is detected, 
this indicates that the gene and its corresponding polypeptide and antibodies 
against the polypeptide are useful for diagnostic purposes, to screen samples 
to differentiate between normal and tumor. Additional studies can then be 
conducted if further information is desired. First Grimaldi Declaration at ¶ 7 
(emphasis added). 

It is obvious that Mr. Grimaldi was stating that it is his expert opinion that the 
information provided in Example 18 is sufficient to use the gene, protein and antibody as 
diagnostic tools, and that no further testing or information is required. However, if additional 
information is desired, such as the role of the gene or protein in cancer formation or growth, 
additional studies can be conducted. It is disingenuous of the Examiner to take this quote out of 
context to suggest that Mr. Grimaldi is stating that further research is required to use the claimed 
invention when the remainder of his declaration clearly states otherwise. 

The Examiner's second response to the statement in the Polakis Declaration that there is a 
correlation between changes in the level of mRNA and changes in the level of the encoded 
protein is: 

[I]t is important to note that the instant specification provides no information 
regarding protein levels. Only mRNA expression data was [sic] presented. 
Therefore the declaration is insufficient to overcome the rejection of claims 4-8 
and 11-17 based upon 35 U.S.C. § 101 and 112, first paragraph, since it is limited 
to a discussion of data regarding the correlation of mRNA levels and polypeptide 
levels. Final Office Action at 10. 

The Polakis Declaration presents the opinion of an expert based on data, the scientific 
literature and more than 20 years of research experience that "an increased level of mRNA in a 
tumor cell relative to a normal cell typically correlates to a similar increase in abundance of the 
encoded protein in the tumor cell relative to the normal cell" and that "it remains a central dogma 
in molecular biology that increased mRNA levels are predictive of corresponding increased 
levels of the encoded protein." Polakis Declaration at ¶ 6. As the Examiner has acknowledged, 
the data in Example 18 are differential mRNA data regarding the level of PRO 1926 mRNA in 
tumor samples compared to normal tissue samples. Therefore, Dr. Polakis' statement regarding 
a correlation between mRNA levels and protein levels is very probative, and demonstrates that 
one of skill in the art would believe that PRO 1926 polypeptide is differentially expressed in 
esophageal tumors. The Examiner's arguments are simply not responsive to Dr. Polakis' 
declaration. 

d. The Examiner's response to Appellants' Supporting References 
Finally, Appellants turn to the Examiner's response to Appellants' supporting references 
discussed above. The Examiner acknowledges the submission of two Alberts references, as well 
as the Lewin, Zhigang, and Meric references. See Final Office Action at 11-12. However, the 
Examiner only responds to the Meric reference, essentially ignoring the remaining references. 
The Examiner states that "[f|urther reading of Meric et al casts doubts on Applicants claim that 
there is a direct correlation between increased mRNA levels and the level of expression of the 
encoded protein. For example, the reference discusses that variations in mRNA sequences 
increase or decrease translational efficiency as found in BRCAl (see pages 973-974)." Final 
Office Action at 12. 

This argument is not responsive to Appellants' argument that Meric teaches that "[t]he 
fundamental principle of molecular therapeutics in cancer is to exploit the differences in gene 
expression between cancer and normal cells." Meric at 971. Meric does teach that mutations of 
genes as well as alternate splicing and alternate transcription start sites can lead to altered 
translation efficiency in certain cancer cells. Id. at 973-974. As support, Meric cites three 
examples of point mutations, and four examples of alternate splicing. Id. at 974. However, the 
Examiner has not shown, and there is no evidence, that the PRO 1926 mRNA is either mutated, 
alternately spliced, or has an alternate transcription start site. Nor has the Examiner established 
that point mutations or alternate splice variants leading to changes in translation efficiency are 
common in cancer, or common in esophageal cancer in particular. These few examples are not 
sufficient to provide evidence that one skilled in the art would reasonably doubt Appellants' 
asserted utility, or reject the teaching of Appellants' supporting references and declarations. 

As the supporting references and declarations Appellants have discussed above make 
clear, regulation of mRNA levels is the predominant control mechanism for the majority of 
genes. Meric supports this assertion because "[t]he fundamental principle of molecular 
therapeutics in cancer is to exploit the differences in gene expression between cancer cells and 
normal cells." Meric et al. at 971 (emphasis added). The only reason mRNA is of any interest in 
studying the mechanism of cancer formation and growth is because mRNA encodes protein. If 
there were no general correlation between differences in mRNA and differences in protein, there 
would be no reason to study changes in mRNA. 

In conclusion, Appellants have offered sufficient evidence to establish that it is more 
likely than not that one of skill in the art would believe that because the PRO 1926 mRNA is 
underexpressed in esophageal tumors compared to normal esophageal tissue, the PRO 1926 
polypeptide will also be underexpressed in esophageal tumors compared to normal esophageal 
tissue. This differential expression of the PRO 1926 polypeptide makes it useful as a diagnostic 
tool for cancer. In short, none of the Examiner's responses to Appellants' supporting evidence 
are sufficient to rebut Appellants' asserted utility. 

9. The Courts have held that the Utility Requirement was Satisfied in Similar 
Cases 

The seminal decision interpreting the utility requirement of 35 U.S.C. § 101 is Brenner v. 
Manson, 383 U.S. 519, 148 U.S.P.Q. 689 (1966). At issue in Brenner was a claim to "a 
chemical process which yields an already known product whose utility - other than as a possible 
object of scientific inquiry - ha[d] not yet been evidenced." Id. at 529, 148 U.S.P.Q. at 693. The 
Patent Office rejected the claimed process for lack of utility because the product produced by the 
claimed process had no known use. See id. at 521-22, 148 U.S.P.Q. at 690. On appeal, the Court 
of Customs and Patent Appeals reversed, holding "where a claimed process produces a known 
product it is not necessary to show utility for the product." Id. at 522, 148 U.S.P.Q. at 691. 

In reviewing the lower court's decision, the Court made its oft quoted statement that 
"[t]he basic quid pro quo contemplated by the Constitution and the Congress for granting a 
patent monopoly is the benefit derived by the public from an invention with substantial utility. 
Unless and until a process is refined and developed to this point - where specific benefit exists in 
currently available form - there is insufficient justification for permitting an applicant to engross 
what may prove to be a broad field." Id. at 534-35, 148 U.S.P.Q. at 695. 

The first opinion of the C.C.P.A. applying Brenner was In re Kirk, 376 F.2d 936, 153 
U.S.P.Q. 48 (C.C.P.A. 1967). The invention claimed in Kirk was a set of steroid derivatives said 
to have valuable biological properties and to be of value "in the furtherance of steroidal research 
and in the application of steroidal materials to veterinary or medical practice." Id. at 938, 153 
U.S.P.Q. at 50. In affirming the claim rejection based on a lack of utility, the court held that the 
"nebulous expressions 'biological activity' or 'biological properties'" did not adequately convey 
how to use the claimed compounds. Id. at 941, 153 U.S.P.Q. at 52. The court also rejected 
Appellants' supporting affidavit, stating, "the sum and substance of the affidavit appears to be 
that one of ordinary skill in the art would know 'how to use' the compounds to find out in the 
first instance whether the compounds are - or are not - in fact useful or possess useful properties, 
and to ascertain what those properties are." Id. at 942, 153 U.S.P.Q. at 53. 

Since these early decisions, the courts have continued to clarify what is sufficient to 
satisfy the utility requirement. Three more recent decisions are of particular relevance to the 
instant application: Nelson v. Bowler, 626 F.2d 853, 206 U.S.P.Q. 881 (C.C.P.A. 1980), Cross v. 
Iizuka, 753 F.2d 1040, 224 U.S.P.Q. 739 (Fed. Cir. 1985), and Fujikawa v. Wattanasin, 93 F.3d 
1559, 39 U.S.P.Q. 2d 1895 (Fed. Cir. 1996). 

The earliest of these cases, Nelson v. Bowler, involved an interference between two 
applications related to derivatives of naturally occurring prostaglandins (PG). Nelson, 626 F.2d 
at 854-55. The issue was whether Nelson had shown at least one utility for the compounds at 
issue to establish an actual reduction to practice. Id. at 855. The Appellants relied on two tests 
to prove practical utility: an in vivo rat blood pressure (BP) test and an in vitro gerbil colon 
smooth muscle stimulation (GC-SMS) test. In the BP test, the blood pressure of anesthetized 
rats was recorded on a polygraph chart to determine whether an injected compound had any 
effect. Responses were categorized as either a depressor (lowering) effect or a pressor 
(elevating) effect. Id. In the GC-SMS test a section of colon was excised from a freshly-killed 
gerbil for suspension in a physiological solution, and a lever arm was connected to the colon in 
such a way that any contraction was recorded as a polygraph trace. Id. The Board held that 
Nelson had not shown adequate proof of practical utility, characterizing the tests as "rough 
screens, uncorrelated with actual utility." Id. at 856. 

On appeal the C.C.P.A. reversed, holding that the Board "erred in not recognizing that 
tests evidencing pharmacological activity may manifest a practical utility even though they may 
not establish a specific therapeutic use." Id. The Court stated that "practical utility" was 
characterized as a use of the claimed discovery in a manner which provides some immediate 
benefit to the public, establishing the following rule: 

Knowledge of the pharmacological activity of any compound is obviously 
beneficial to the public. It is inherently faster and easier to combat illnesses and 
alleviate symptoms when the medical profession is armed with an arsenal of 
chemicals having known pharmacological activities. Since it is crucial to provide 
researchers with an incentive to disclose pharmacological activities in as many 
compounds as possible, we conclude that adequate proof of any such activity 
constitutes a showing of practical utility. Id. (emphasis added). 

The Court rejected Bowler's argument that the BP and GC-SMS tests are inconclusive 
showings of pharmacological activity since confirmation by statistically significant means did 
not occur until after the critical date. The Court stated that "a rigorous correlation is not 
necessary where the test for pharmacological activity is reasonably indicative of the desired 
response." Id. (emphasis added). The Court concluded that a "reasonable correlation" between 
the observed properties and the suggested use was sufficient to establish practical utility. Id. at 
857. 

The sufficiency of a "reasonable correlation" in establishing utility was affirmed by the 
Court of Appeals for the Federal Circuit in Cross v. Iizuka, 753 F.2d 1040, 224 U.S.P.Q. 739 
(Fed. Cir. 1985). In Cross, the subject of the interference before the Court was imidazole 
derivative compounds which inhibit the synthesis of thromboxane synthetase, an enzyme which 
leads to the formation of thromboxane A2. At the time the applications were filed, 
thromboxane A2 was postulated to be involved in platelet aggregation, which was associated 
with several deleterious conditions. Id. at 1042. 

The question before the Board and reviewed by the Court was whether Iizuka was 
entitled to the benefit of his Japanese priority application. Id. The Japanese application 
disclosed that the imidazole derivatives showed strong inhibitory action for thromboxane 
synthetase from human or bovine platelet microsomes, an in vitro utility. Id. at 1043. Relying in 
part on Nelson, the Board held that tests evidencing pharmacological activity may manifest a 
practical utility even though they may not establish a specific therapeutic use, and concluded that 
the in vitro tests were sufficient to establish a practical utility. Id. 

On appeal, Cross argued that the basic in vitro tests conducted in cellular fractions did not 
establish a practical utility for the claimed compounds, and that more sophisticated in vitro or in 
vivo tests were necessary to establish a practical utility. Id. at 1050. The Court rejected this 
argument, noting that adequate proof of any pharmaceutical activity constitutes a showing of 
practical utility. Id. The Court accepted the argument that initial testing of compounds is widely 
done in vitro: 
[I]n vitro results... are generally predictive of in vivo test results, i.e., there is a 
reasonable correlation therebetween. Were this not so, the testing procedures of 
the pharmaceutical industry would not be as they are. Iizuka has not urged, and 
rightly so, that there is an invariable exact correlation between in vitro test results 
and in vivo test results. Rather, Iizuka's position is that successful in vitro testing 
for a particular pharmacological activity establishes a significant probability that 
in vivo testing for this particular pharmacological activity will be successful. Id. 
(emphasis added). 

The Court also noted that in previous decisions, its predecessor court had accepted 
evidence of in vivo utility as sufficient to establish practical utility. The Court reasoned that: 

This in vivo testing is but an intermediate link in a screening chain which may 
eventually lead to the use of the drug as a therapeutic agent in humans. We 
perceive no insurmountable difficulty, under appropriate circumstances, in finding 
that the first link in the screening chain, in vitro testing, may establish a practical 
utility for the compound in question. Successful in vitro testing will marshal 
resources and direct the expenditure of effort to further in vivo testing of the most 
potent compounds, thereby providing an immediate benefit to the public, 
analogous to the benefit provided by the showing of an in vivo utility. Id. at 1051, 
citing Nelson, 626 F.2d at 856 (emphasis added). 

Based on this reasoning, the Court affirmed the decision of the Board, stating that "based 
upon the relevant evidence as a whole, there is a reasonable correlation between the disclosed in 
vitro utility and an in vivo activity, and therefore a rigorous correlation is not necessary where 
the disclosure of pharmacological activity is reasonable based upon the probative evidence." Id. 
at 1050 (emphasis added). The Court therefore held that the disclosed in vitro utility was 
"sufficient to comply with the practical utility requirement of § 101." Id. at 1051. 

The holdings of Nelson and Cross were more recently affirmed in Fujikawa v. 
Wattanasin, 93 F.3d 1559, 39 U.S.P.Q.2d 1895 (Fed. Cir. 1996). In Fujikawa, the Court again 
affirmed the notion that initial screens of compounds provide a practical utility even though they 
may not provide a therapeutic use because "'[i]t is inherently faster and easier to combat 
illnesses and alleviate symptoms when the medical profession is armed with an arsenal of 
chemicals having known pharmacological activities.'" Id. at 1564, quoting Nelson, 626 F.2d at 
856. The Court noted that it may be difficult to predict whether novel compounds will exhibit 
pharmacological activity, and consequently testing is often required to establish practical utility. 
Id. However, the Court went on to state: 

But the test results need not absolutely prove that the compound is 
pharmacologically active. All that is required is that the tests be "reasonably 
indicative of the desired [pharmacological] response." In other words, there must 
be a sufficient correlation between the tests and an asserted pharmacological 
activity so as to convince those skilled in the art, to a reasonable probability, that 
the novel compound will exhibit the asserted pharmacological behavior. Id. 
(internal citations omitted, underline emphasis added, italics in original). 

On appeal, Fujikawa argued that Wattanasin had failed to establish an adequate 
correlation between the in vitro and in vivo results to permit Wattanasin to rely on positive in 
vitro results to establish a practical utility. The Court stated that the Board relied on testimony 
from those skilled in the art that the in vitro results convinced the experts that the claimed 
compounds would exhibit the desired pharmacological activity when administered in vivo, 
including testimony that in vivo activity is typically highly correlatable to a compound's in vitro 
activity in the field. Id. at 1565. To overcome this evidence and counter the Board's decision, 
Fujikawa pointed to the testimony of its expert that "there is a reasonable element of doubt that 
some elements may be encountered which are active in the in vitro assay, but yet inactive in the 
in vivo assay." Id. 

The Court rejected this argument: "Of course, it is possible that some compounds active 
in vitro may not be active in vivo. But, as our predecessor court in Nelson explained, a 'rigorous 
correlation' need not be shown in order to establish practical utility; 'reasonable correlation' 
suffices." Id. (emphasis added). The Court also rejected Fujikawa's reliance on two articles. 
The Court noted that while one article taught that "in vitro testing is sometimes not a good 
indicator of how potent a compound will be in vivo, it does imply that compounds which are 
active in vitro will normally exhibit some in vivo activity." Id. at 1566. Similarly, the Court 
noted that the second article expressly stated that "[f]or most substances, although not for all, the 
relative potency determined in in vitro . . . parallels the in vivo activity." Id. 

The Court concluded that the facts in the case were analogous to the ones in Cross where 
the court relied on a known reasonable correlation between in vitro tests and in vivo activity, and 
therefore affirmed the Board's decision that Wattanasin had established a practical utility with 
the in vitro results. Id. at 1565-66. 

The Nelson, Cross, and Fujikawa cases are very similar to the present case. The 
reasoning of the courts in all three cases that "'[i]t is inherently faster and easier to combat 
illnesses and alleviate symptoms when the medical profession is armed with an arsenal of 
chemicals having known pharmacological activities'" applies to the asserted utility for the 
claimed polypeptides. Fujikawa, 93 F.3d at 1564, quoting Nelson, 626 F.2d at 856; see also 
Cross, 753 F.2d at 1051 ("Successful in vitro testing will marshal resources and direct the 
expenditure of effort to further in vivo testing of the most potent compounds, thereby providing 
an immediate benefit to the public, analogous to the benefit provided by the showing of an in 
vivo utility."). Like pharmaceutical compounds, nucleic acids, polypeptides, and antibodies 
which are associated with cancer will make it inherently faster and easier to combat cancer. The 
greater the number of biological markers of cancer medical professionals have access to, the 
more accurate and detailed a diagnosis they can make. The determination that a gene is 
differentially expressed in cancer constitutes at least as significant a development in the field of 
cancer diagnostics as in vitro screening for pharmaceutical activity. See Cross, 753 F.2d at 1051 
("the first link in the screening chain, in vitro testing, may establish a practical utility for the 
compound in question. Successful in vitro testing will marshal resources and direct the 
expenditure of effort to further in vivo testing of the most potent compounds, thereby providing 
an immediate benefit to the public"). 

In addition, like in vitro tests in the pharmaceutical industry, those of skill in the field of 
biotechnology rely on the reasonable correlation that exists between gene expression and protein 
expression (see discussion supra). Were there no reasonable correlation between the two, the 
techniques that measure gene levels such as microarray analysis, differential display, and 
quantitative PCR would not be so widely used by those in the art. See Second Grimaldi 
Declaration at ¶ 5. As in Cross, Appellants here do not argue that there is "an invariable exact 
correlation" between gene expression and protein expression. See Cross, 753 F.2d at 1050. 
Instead, Appellants' position detailed above is that a measured change in gene expression in 
cancer cells establishes a "significant probability" that the expression of the encoded polypeptide 
in cancer will also be changed based on "a reasonable correlation therebetween." Id.; see also 
Fujikawa, 93 F.3d at 1565 ("a 'rigorous correlation' need not be shown in order to establish 
practical utility; 'reasonable correlation' suffices"); Nelson, 626 F.2d at 857 (holding that "a 
rigorous correlation is not necessary" and that a "reasonable correlation" will suffice). 

Also of importance is the Court's rejection of the notion that any in vitro testing must be 
statistically significant to support a practical utility. Nelson, 626 F.2d at 857. Likewise, 
qualitative characterizations of a test compound as either increasing or decreasing blood pressure 
were acceptable. Id. at 855 (stating that responses were categorized as either a depressor 
(lowering) effect or a pressor (elevating) effect). This is similar to the data in Example 18, 
where the change in mRNA levels is described as "more highly expressed." 

There are additional similarities. In Fujikawa, the Board and Court rejected the argument 
that there was no utility because there was no exact correlation between the in vitro and in vivo 
results in spite of supporting testimony and references. Fujikawa, 93 F.3d at 1565-66. Like the 
two references rejected by the Board and Court in Fujikawa, the Chen et al. reference cited by 
the Examiner may suggest that the correlation between changes in mRNA levels and protein 
levels is not exact. But as in Fujikawa, portions of Chen et al. also support Appellants' assertion, 
and Appellants have submitted the declarations of two experts in the field, which state that those in 
the field rely on the correlation between changes in mRNA and protein. See Second Grimaldi 
Declaration at ¶ 5; Polakis Declaration at ¶ 6. Thus, as was the case in Fujikawa, although there 
may be some evidence that the correlation relied on is not exact, the declarations and numerous 
references submitted by Appellants are more than enough evidence to establish that there is a 
"reasonable correlation" between changes in mRNA levels and changes in the level of the 
encoded protein. 

In conclusion, Appellants have asserted that the claimed polypeptides are useful for the 
diagnosis of cancer, particularly esophageal cancer, based on the data in Example 18. This utility 
is far beyond the nebulous expressions "biological activity" or "biological properties" rejected in 
In re Kirk, 376 F.2d 936, 153 U.S.P.Q. 48 (C.C.P.A. 1967). Like Nelson, Cross, and Fujikawa, 
Appellants have asserted a utility which relies on a reasonable correlation between the data 
disclosed in the application and the asserted utility. The fact that there may be limited evidence 
that the correlation is not exact does not invalidate Appellants' showing of utility since the 
correlation need not be a rigorous or exact one. Considering the relevant evidence as a whole, 
Appellants have provided sufficient evidence to establish a reasonable correlation between 
changes in the level of mRNA and corresponding changes in the level of the encoded 
polypeptide. Therefore the claimed polypeptides have a practical utility as diagnostic tools for 
esophageal cancer. 

10. Utility - Conclusion 

Appellants' asserted utility for the claimed polypeptides as diagnostic tools for cancer 
corresponds in scope to the subject matter sought to be patented and therefore "must be taken as 
sufficient to satisfy the utility requirement of § 101 for the entire claimed subject." In re Langer, 
503 F.2d 1380, 1391, 183 U.S.P.Q. 288, 297 (C.C.P.A. 1974). The Examiner's unsupported 
arguments and references are not sufficient evidence to make a prima facie showing that "one of 
ordinary skill in the art would reasonably doubt the asserted utility." In re Brana, 51 F.3d 1560, 
1566, 34 U.S.P.Q.2d 1436 (Fed. Cir. 1995). 

And even if the Examiner has established a prima facie case, Appellants have offered 
sufficient rebuttal evidence in the form of expert declarations and references, which, when 
considered as a whole, establish that it is more likely than not that the asserted utility is true. See 
In re Oetiker, 977 F.2d 1443, 1445, 24 U.S.P.Q.2d 1443, 1444 (Fed. Cir. 1992) (stating that the 
evidentiary standard to be used throughout ex parte examination in setting forth a rejection is a 
preponderance of the evidence, or "more likely than not" standard); M.P.E.P. at § 2107.02, part 
VII ("evidence will be sufficient if, considered as a whole, it leads a person of ordinary skill in 
the art to conclude that the asserted utility is more likely than not true.") (emphasis in original). 

Finally, the courts' decisions in similar cases make clear that the evidence provided by 
Appellants is sufficient to establish the asserted utility. The evidence does not need to be direct 
evidence, nor does it need to provide an exact correlation between the submitted evidence and 
the asserted utility. Instead, evidence which is "reasonably" correlated with the asserted utility is 
sufficient. See Fujikawa, 93 F.3d at 1565 ("a 'rigorous correlation' need not be shown in order 
to establish practical utility; 'reasonable correlation' suffices"); Cross, 753 F.2d at 1050 (same); 
Nelson, 626 F.2d at 857 (same). Considering the evidence as a whole in light of the relevant 
cases, the Board should find that Appellants have established at least one specific, substantial, 
and credible utility, and the Examiner's rejection of the pending claims as lacking utility should 
be reversed. 

C. Enablement Rejection - Detailed Argument 

The second issue before the Board is whether Appellants have enabled the pending 
claims such that one of skill in the art would be able to make and use the claimed invention. The 
Examiner has rejected pending Claims 6-8 and 12-17 under 35 U.S.C. §112, first paragraph, 
arguing that because the claimed invention is not supported by either a specific or substantial 
asserted utility or a well-established utility, one skilled in the art clearly would not know how to 
use the claimed invention. See First Office Action at 10. In addition, the Examiner makes the 
conclusory statement that even if the specification taught how to use the PRO 1926 polypeptide, 
enablement would not be commensurate in scope with claims which encompass percent variants 
and fragments of SEQ ID NO: 136 because there is no structural or functional information 
provided in the specification. Office Action at 14-15. In addition, without any reasoning, 
analysis or factual support, the Examiner summarily states that: 

In addition, the lack of direction/guidance presented in the specification regarding 
which variants of polypeptides of SEQ ID NO: 136 would retain the desired 
activity, the complex nature of the invention, the state of the prior art establishing 
that biological activity cannot be predicted based on structural similarity, the 
absence of working examples directed to variants and the breath of claims, undue 
experimentation would be required of the skilled artisan to make and/or use the 
claimed invention in its full scope. Office Action at 15. 

Appellants submit that Claims 6-8 and 12-17 are enabled such that one of skill in the art 
could make and use the claimed polypeptides without undue experimentation. With respect to 
Claims 6-8 and 12-13, how to make the polypeptide of SEQ ID NO:136 and the polypeptide 
encoded by the cDNA deposited under ATCC accession number 203547 is within the skill in the 
art. Similarly, with respect to Claims 14-17, it is well within the skill of those in the art to make 
polypeptides that are at least 95% identical to SEQ ID NO: 136 and the polypeptide encoded by 
ATCC 203547, and it is well within those of skill in the art to make antibodies which are specific 
to a disclosed sequence. See also In re Wands, 858 F.2d 731 (reversing the Board's decision of 
non-enablement and holding that as of 1980, undue experimentation was not required to make 
high-affinity monoclonal antibodies to a target peptide). Thus, one of skill in the art would be 
able to make the claimed polypeptides without undue experimentation. 

As described above, Appellants assert that the claimed polypeptides are useful as 
diagnostic tools for cancer, particularly esophageal cancer. This use is based on the disclosure in 
Example 18 of the instant application that the nucleic acid encoding the PRO 1926 polypeptide is 
at least two-fold differentially expressed in esophageal tumor relative to normal esophageal 
tissue. As detailed above, it is well-established that changes in expression levels of mRNA lead 
to corresponding changes in expression levels of the encoded polypeptide, and thus it is likely 
that the PRO 1926 polypeptide also is differentially expressed in esophageal tumors. Thus, based 
on the disclosure in the application, one of skill in the art would be able to use the claimed 
polypeptides as diagnostic tools to distinguish suspected esophageal tumors from normal tissue 
without undue experimentation. 

1. Enablement - Legal Standard 

An application enables the claims "if one skilled in the art, after reading the[] disclosure[], 
could practice the invention claimed ... without undue experimentation." Chiron Corp. v. 
Genentech Inc., 363 F.3d 1247, 1253 (Fed. Cir. 2004). "But the question of undue 
experimentation is a matter of degree. The fact that some experimentation is necessary does not 
preclude enablement; what is required is that the amount of experimentation 'must not be unduly 
extensive.'" PPG Indus., Inc. v. Guardian Indus. Corp., 75 F.3d 1558, 1564 (Fed. Cir. 1996) 
(quoting Atlas Powder Co. v. E.I. DuPont de Nemours & Co., 750 F.2d 1569, 1576 (Fed. Cir. 
1984)). 

While the application must enable one of ordinary skill in the art to practice the full scope 
of the claimed invention, "[t]hat is not to say that the specification itself must necessarily 
describe how to make and use every possible variant of the claimed invention, for the artisan's 
knowledge of the prior art and routine experimentation can often fill gaps, interpolate between 
embodiments, and perhaps even extrapolate beyond the disclosed embodiments, depending upon 
the predictability of the art." AK Steel Corp. v. Sollac, 344 F.3d 1234, 1244 (Fed. Cir. 2003). 

"Enablement is not precluded by the necessity for some experimentation such as routine 
screening. However, experimentation needed to practice the invention must not be undue 
experimentation. The key word is 'undue,' not 'experimentation.'" In re Wands, 858 F.2d 731, 
736-37, 8 U.S.P.Q.2d 1400 (Fed. Cir. 1988) (citations omitted). 

It is equally clear that a rejection based on "lack of utility," whether 
grounded upon 35 U.S.C. 101 or 35 U.S.C. 112, first paragraph, rests on the same 
basis (i.e., the asserted utility is not credible). To avoid confusion, any rejection 
that is imposed on the basis of 35 U.S.C. 101 should be accompanied by a 
rejection based on 35 U.S.C. 112, first paragraph. The 35 U.S.C. 112, first 
paragraph, rejection should be set out as a separate rejection that incorporates by 
reference the factual basis and conclusions set forth in the 35 U.S.C. 101 rejection. 
The 35 U.S.C. 112, first paragraph, rejection should indicate that because the 
invention as claimed does not have utility, a person skilled in the art would not be 
able to use the invention as claimed, and as such, the claim is defective under 35 
U.S.C. 112, first paragraph. A 35 U.S.C. 112, first paragraph, rejection should not 
be imposed or maintained unless an appropriate basis exists for imposing a 
rejection under 35 U.S.C. 101. In other words, Office personnel should not 
impose a 35 U.S.C. 112, first paragraph, rejection grounded on a "lack of 
utility" basis unless a 35 U.S.C. 101 rejection is proper. In particular, the 
factual showing needed to impose a rejection under 35 U.S.C. 101 must be 
provided if a rejection under 35 U.S.C. 112, first paragraph, is to be imposed on 
"lack of utility" grounds. 

To avoid confusion during examination, any rejection under 35 U.S.C. 112, 
first paragraph, based on grounds other than "lack of utility" should be imposed 
separately from any rejection imposed due to "lack of utility" under 35 U.S.C. 
101 and 35 U.S.C. 112, first paragraph. M.P.E.P. § 2107.01 IV (emphasis added). 

2. Enablement - Burden of Proof 

In order to make an enablement rejection, the PTO has the initial burden to establish a 
reasonable basis to question the enablement provided for the claimed invention. See M.P.E.P. § 
2164.04. A specification teaching how to make and use the claimed subject matter must be taken 
as being in compliance with the enablement requirement unless there is a reason to doubt the 
objective truth of the statements contained therein which are relied on for enabling support. Id. 
It is incumbent upon the PTO "to explain why it doubts the truth or accuracy of any statement in a 
supporting disclosure and to back up assertions of its own with acceptable evidence or reasoning 
which is inconsistent with the contested statement." Id. (quoting In re Marzocchi, 439 F.2d 220, 
224, 169 U.S.P.Q. 367, 370 (C.C.P.A. 1971)). This can be done "by making specific findings of 
fact, supported by the evidence, and then drawing conclusions based on these findings of fact." 
Id. 

3. Enablement - Standard of Proof 

Once the examiner has weighed all the evidence and established a reasonable basis to 
question the enablement provided for the claimed invention, the burden falls on the applicant to 
present persuasive arguments, supported by suitable proofs where necessary, that one skilled in 
the art would be able to make and use the claimed invention using the application as a guide. See 
M.P.E.P. § 2164.05. "The evidence provided by applicant need not be conclusive but merely 
convincing to one skilled in the art." Id. (bold emphasis added, underline in original). "A 
declaration or affidavit is, itself, evidence that must be considered." Id. (emphasis in original). 

The examiner must then "weigh all the evidence before him or her, including the 
specification and any new evidence supplied by applicant with evidence and/or sound scientific 
reasoning previously presented in the rejection and decide whether the claimed invention is 
enabled." Id. "The examiner should never make the determination based on personal opinion. 
The determination should always be based on the weight of all the evidence." Id. (emphasis in 
original). 

4. Appellants' Specification Teaches How to Make and Use the Claimed Subject 
Matter 

The specification enables one skilled in the art to make and use the full scope of the 
claims without undue experimentation. The claimed subject matter relates to polypeptides of 
SEQ ID NO: 136 and the polypeptide encoded by ATCC deposit 203547, and polypeptides which 
are at least 95% identical to those polypeptides and which can be used to make antibodies that 
specifically detect PRO 1926 in esophageal tissue. 

The specification discloses how to make the claimed polypeptides, for example in 
paragraphs [0283]-[0315] and Examples 6-9 (¶¶ [0453]-[0492]). In addition, methods for making 
polypeptides which are at least 95% identical to SEQ ID NO: 136 by making substitutions or 
deletions are also disclosed in the specification and were well known in the art. See e.g., 
Specification at paragraphs [0256]-[0271]. Methods for making and testing antibodies for 
specificity were well known in the art, and are disclosed in the specification, including 
paragraphs [0361]-[0379] and Example 10 (¶¶ [0493]-[0499]) of the specification, which
specifically describes the preparation of antibodies that bind PRO polypeptides. In addition, the 
specification discloses that antibodies to claimed polypeptides can be used in diagnostic assays 
to detect the expression of PRO 1926 in specific types of tissue. See e.g., Specification at [0407]. 

In light of the differential expression of the nucleic acid encoding the PRO 1926 
polypeptide in esophageal tumors compared to normal esophagus tissue, one of skill in the art 
would have expected the PRO 1926 polypeptide to be differentially expressed in these tumors as
well. Therefore, given the teaching in the specification on how to make and use the claimed 
polypeptides to detect expression of PRO 1926 in specific tissues, one of skill in the art would 
have been enabled to practice the claimed invention without undue experimentation. 

Because Appellants' specification teaches how to make and use the claimed subject 
matter, it must be taken as being in compliance with the enablement requirement unless there is a 
reason to doubt the objective truth of the statements contained therein which are relied on for 
enabling support. See M.P.E.P. § 2164.04.

5. The Examiner's Arguments Fail to Establish a Reasonable Basis to Question
the Enablement Provided for the Claimed Invention in the Specification

The PTO has the initial burden to establish a reasonable basis to question the enablement 
provided for the claimed invention. See M.P.E.P. § 2164.04. It is incumbent upon the PTO "to
explain why it doubts the truth or accuracy of any statement in a supporting disclosure and to
back up assertions of its own with acceptable evidence or reasoning which is inconsistent with
the contested statement." Id. (quoting In re Marzocchi, 439 F.2d 220, 224, 169 U.S.P.Q. 367,
370 (C.C.P.A. 1971)). This can be done "by making specific findings of fact, supported by the
evidence, and then drawing conclusions based on these findings of fact." Id.

In the first Office Action, the Examiner stated that even if the specification taught how to 
use the PRO 1926 polypeptide, enablement would not be commensurate with the scope of the 
claims, arguing: 

The specification discloses one PRO 1926 amino acid sequence with particularity. 
...The specification does not teach how to make PRO 1926 variants or fragments
comprising the sequence. Since a biological function of PRO 1926 is not clear, 
and since one skilled in the art could not determine with reasonable expectation of 
success what a biological function of PRO 1926 would be, the skilled artisan 
would not be able to make PRO 1926 variants or fragments comprising the 
sequence, and test them for biological activity. Furthermore, the specification
provides no guidance as to how the skilled artisan could use inactive PRO 1926 
variant or fragment, as no functional limitation associated with PRO 1926 variants 
or fragments comprising the sequence in the claims. First Office Action at 10.

The Examiner also argues that the problem of predicting protein structure from sequence 
data and in turn utilizing predicted structural determinations to ascertain functional aspects of the 
protein is extremely complex. Id.

In the Final Office Action, the Examiner again makes the conclusory statement that
enablement is not commensurate in scope with claims "because there is no structural or
functional information provided in the specification." Office Action at 15. In addition, without
any reasoning, analysis or factual support, the Examiner summarily states that:

In addition, the lack of direction/guidance presented in the specification regarding 
which variants of polypeptides of SEQ ID NO: 136 would retain the desired 
activity, the complex nature of the invention, the state of the prior art establishing 
that biological activity cannot be predicted based on structural similarity, the 
absence of working examples directed to variants and the breath of claims, undue 
experimentation would be required of the skilled artisan to make and/or use the 
claimed invention in its full scope. Office Action at 15. 

The Examiner then discusses the amendment to Claims 4-5, with only a passing reference
to Claims 14-17 as amended: "Similarly, there is no nexus between the degree of homology and
the ability of the antibody (generated to polypeptide or fragments) to specifically detect the
polypeptide of SEQ ID NO: 136 in skin [sic] tissue samples." Office Action at 16.

The Examiner's unsupported conclusory statements fail to establish a reasonable basis to 
question the enablement provided for the claimed invention. See M.P.E.P. § 2164.04. It is
incumbent upon the PTO "to explain why it doubts the truth or accuracy of any statement in a
supporting disclosure and to back up assertions of its own with acceptable evidence or reasoning
which is inconsistent with the contested statement." Id. (quoting In re Marzocchi, 439 F.2d 220,
224, 169 U.S.P.Q. 367, 370 (C.C.P.A. 1971)). This can be done "by making specific findings of
fact, supported by the evidence, and then drawing conclusions based on these findings of fact."
Id. The Examiner has failed to make any specific findings of fact, or back up his assertions with
any acceptable evidence or reasoning. 

As an initial matter, Appellants note that rejected claims 6-8 and 12-13 do not recite 
percent amino acid sequence identity as a limitation. These claims are directed to peptides of the 
disclosed sequence, with or without the disclosed signal peptide, and fusion proteins thereof 
which would be optimal, for example, in making antibodies. Therefore, any arguments based on 
a failure to enable variants are not applicable to Claims 6-8 and 12-13. 

As explained above, the specification teaches in detail how to make the claimed 
polypeptides, including variants thereof, and antibodies which specifically bind PRO 1926. 
Likewise, as detailed above, the specification provides sufficient guidance as to how to use the 
claimed polypeptides. Thus, contrary to the Examiner's unsupported conclusory statement, there 
is significant guidance how to make and use the claimed polypeptides. In addition, as the 
disclosure and references cited in the specification make clear, the production of polypeptides, 
polypeptide variants, and specific antibodies is a predictable and well established aspect of the 
biological sciences. See, e.g., In re Wands, 858 F.2d 731, 8 U.S.P.Q. 2d 1400 (Fed. Cir. 1988) 
(reversing the Board's decision of non-enablement and holding that as of 1980, undue 
experimentation was not required to make high-affinity monoclonal antibodies to a target 
peptide). 

As for the Examiner's single conclusory statement directed to Claims 14-17, Appellants 
submit that the nexus between the degree of homology and the ability of the antibody to 
specifically detect the polypeptide of SEQ ID NO: 136 in esophageal tissue samples is obvious to
those skilled in the art. Obviously, two polypeptides with 5% amino acid sequence homology 
will share fewer epitopes than two polypeptides with 95% amino acid sequence homology - the 
greater the degree of homology between the antigenic peptide and the target, the greater the 
likelihood that the antibody will be specific for the target peptide in the specified tissue. 

In conclusion, the Examiner has failed to meet his burden to establish a reasonable basis 
to question the enablement provided for the claimed invention - conclusory statements are 
simply not sufficient. 

6. Grouping of Rejected Claims

For purposes of the enablement rejection, Claims 6-8 and 12-13 can be considered as a 
group. Claims 14 and 16-17 can be considered as a group, and Claim 15 should be considered 
individually. 

a. Claims 6-8 and 12-13 are enabled 

Claims 6-8 and 12-13 are enabled for the reasons discussed above. The scope of these 
claims is narrow, and because SEQ ID NO: 136, the signal peptide, and ATCC deposit 203547 
are explicitly disclosed in the specification, no experimentation of any kind is required to make 
the claimed polypeptides. One of skill in the art would clearly be able to use these polypeptides 
to make antibodies which are specific to either of these polypeptides, such that the level of 
expression of these polypeptides could be assessed in esophageal tissue. The only question is 
whether the use of these polypeptides to make antibodies to detect their expression level is a 
substantial and specific utility. For the reasons discussed at length above, Appellants believe that 
differential expression of the PRO 1926 mRNA in esophageal tumors provides the claimed 
polypeptides with a substantial and specific utility. Therefore, Claims 6-8 and 12-13 are enabled. 

b. Claims 14 and 16-17 are enabled

Claims 14 and 16-17 are enabled for the reasons discussed above. The scope of these 
claims is broader than that of Claims 6-8 and 12-13. Because SEQ ID NO: 136 is 242 amino 
acids long, a polypeptide which is at least 95% identical can only have approximately 12 
deletions or substitutions. While some experimentation will be required to make these 
polypeptides that are not identical to SEQ ID NO: 136, any such experimentation is routine in the
art and will not be undue. One of skill in the art would clearly be able to use these polypeptides 
to make antibodies which are specific to SEQ ID NO: 136, such that the expression level of 
PRO 1926 can be assessed in esophageal tissue. The only question is whether the use of these 
polypeptides to make antibodies to detect the expression level of PRO 1926 is a substantial and 
specific utility. For the reasons discussed at length above, Appellants believe that it is, and 
therefore, Claims 14 and 16-17 are enabled.

c. Claim 15 is enabled 

Claim 15 is enabled for the reasons discussed above. The scope of this claim is narrower 
than that of Claim 14 since a polypeptide which is at least 99% identical can only have 
approximately 2 deletions or substitutions. As a result, less experimentation will be required to 
make these polypeptides, although any experimentation remains routine. One of skill in the art 
would clearly be able to use these polypeptides to make antibodies which are specific to SEQ ID 
NO: 136, and for the reasons discussed above, Appellants believe that such use is substantial and
specific. Therefore, Claim 15 is enabled. 
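
The approximate counts of permitted changes cited for Claims 14 and 15 follow from simple arithmetic on the 242-residue length of SEQ ID NO: 136. The sketch below is illustrative only and is not part of the record. It assumes, for simplicity, that percent identity is measured as identical positions over the full reference length and that every change is a substitution (deletions would alter the aligned length). The function name is hypothetical.

```python
def max_changes(length: int, min_identity_pct: int) -> int:
    """Largest whole number of changed positions that still meets the identity floor.

    Simplifying assumption: identity = identical positions / reference length,
    with substitutions only. Integer arithmetic avoids floating-point rounding.
    """
    return length * (100 - min_identity_pct) // 100


SEQ_LENGTH = 242  # length of SEQ ID NO: 136 as stated in the brief

print(max_changes(SEQ_LENGTH, 95))  # 12 (Claim 14: at least 95% identity)
print(max_changes(SEQ_LENGTH, 99))  # 2  (Claim 15: at least 99% identity)
```

Under these assumptions, 12 changes leave 230 of 242 positions identical (about 95.04%), while a thirteenth change would fall below the 95% floor.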

7. Enablement - Conclusion 

For the reasons discussed above, the specification enables the scope of the claimed 
invention such that one skilled in the art could make and use the claimed invention without 
undue experimentation. The Examiner has offered only conclusory statements, and has failed to 
back up his assertions "with acceptable evidence or reasoning which is inconsistent with the 
contested statement [of enablement]." Id. (quoting In re Marzocchi, 439 F.2d 220, 224, 169
U.S.P.Q. 367, 370 (C.C.P.A. 1971)). Therefore, the Examiner has failed to meet his initial burden
to establish a reasonable basis to question the enablement provided for the claimed invention.
See M.P.E.P. § 2164.04.

And even if the Examiner has met his burden, Appellants have presented persuasive 
arguments, supported by the evidence discussed above with respect to utility, that one skilled in 
the art would be able to make and use the claimed invention using the application as a guide. 
Appellants remind the Board that "[t]he evidence provided by applicant need not be conclusive 
but merely convincing to one skilled in the art." M.P.E.P. § 2164.05 (emphasis in original).

Appellants submit that rejected Claims 6-8 and 12-13 should be considered as a group,
Claims 14 and 16-17 should be considered as a group, and Claim 15 should be considered
individually. This is because the scope of the subject matter in each of the groups differs, and 
therefore varying amounts of experimentation will be required to make and use each of the 
groups. However, Appellants submit that even for the group with the broadest coverage, Claims 
14 and 16-17, any experimentation would be routine for those of skill in the art given the high 
level of sequence identity required for the claims. 

Considering all of the evidence provided by the Appellants to establish their asserted 
utility, along with the disclosure in the specification, the Board should find that Appellants have 
established that one of skill in the art would be able to make and use the claimed invention 
without undue experimentation, and the Examiner's rejection of the pending claims as lacking an 
enabling disclosure should be reversed. 

D. Written Description Rejection - Detailed Arguments 

The third issue before the Board is whether the claimed subject matter is described in the 
specification in such a way as to reasonably convey to one skilled in the art that the inventors had 
possession of the claimed invention at the time the application was filed. The Examiner has 
rejected pending Claims 6-8 and 12-17 under 35 U.S.C. §112, first paragraph, as lacking an 
adequate written description, stating that "even a very skilled artisan could not
envision the detailed chemical structure of all or a significant number of encompassed PRO 1926 
polypeptides, and therefore, would not know how to make or use them." Office Action at 17. 

The Examiner has failed to meet his initial burden of rebutting the presumption that the 
written description is adequate because nowhere in the Final Office Action does the Examiner 
address his arguments to Claims 6-8 and 12-17, and the arguments he has made are either flawed, 
or do not apply to Claims 6-8 and 12-17. For the reasons detailed below, Appellants submit that 
Claims 6-8 and 12-17 are adequately described such that one of skill in the art would recognize 
that the inventors had possession of the claimed invention at the time the application was filed. 

1. Written Description - Legal Standard

The well-established test for sufficiency of support under the written description
requirement of 35 U.S.C. § 112, first paragraph, is stated by the Court in Vas-Cath, Inc. v.
Mahurkar, 935 F.2d 1555, 19 U.S.P.Q.2d 1111 (Fed. Cir. 1991):

"Although [the applicant] does not have to describe exactly the subject matter 
claimed, ... the description must clearly allow persons of ordinary skill in the art 
to recognize that [he or she] invented what is claimed." "The test for sufficiency 
of support in a parent application is whether the disclosure of the application 
relied upon 'reasonably conveys to the artisan that the inventor had possession at 
that time of the later claimed subject matter.'" Vas-Cath, Inc. v. Mahurkar, 935 
F.2d at 1562-63, 19 U.S.P.Q.2d at 1116 (citations omitted).

2. Written Description - Burden of Proof 

The M.P.E.P. states that "[a] description as filed is presumed to be adequate, unless or
until sufficient evidence or reasoning to the contrary has been presented by the examiner to rebut
the presumption. See, e.g., In re Marzocchi, 439 F.2d 220, 224, 169 U.S.P.Q. 367, 370 (C.C.P.A.
1971)." M.P.E.P. § 2163.04 (emphasis added). Therefore "[t]he examiner has the initial burden
of presenting by a preponderance of evidence why a person skilled in the art would not recognize 
in an applicant's disclosure a description of the invention defined by the claims. Wertheim, 541 
F.2d at 263, 191 U.S.P.Q. at 97." Id. Only then does the Applicant need to respond to the 
Examiner's arguments. 

3. Written Description - Standard of Proof 

The adequacy of written description support is a factual issue and is to be determined on 
a case-by-case basis. See e.g., Vas-Cath, Inc. v. Mahurkar, 935 F.2d at 1563, 19 U.S.P.Q.2d at
1116 (Fed. Cir. 1991) (emphasis added). The factual determination in a written description 
analysis depends on the nature of the invention and the amount of knowledge imparted to those 
skilled in the art by the disclosure. Union Oil v. Atlantic Richfield Co., 208 F.3d 989, 996 (Fed. 
Cir. 2000). As with the utility requirement, the evidentiary standard to be used throughout ex 
parte examination in setting forth a rejection is a preponderance of the evidence, or "more likely
than not" standard. In re Oetiker, 977 F.2d 1443, 1445, 24 U.S.P.Q.2d 1443, 1444 (Fed. Cir.
1992). 

4. The Examiner's Arguments

To overcome the presumption that the claimed subject matter is adequately described, the 
Examiner must present "evidence why a person skilled in the art would not recognize in an 
applicant's disclosure a description of the invention defined by the claims. Wertheim, 541 F.2d 
at 263, 191 U.S.P.Q. at 97." M.P.E.P. § 2163.04. During the course of prosecution, the
Examiner has made essentially two arguments in an attempt to rebut this presumption. 

First, the Examiner has asserted that the claims "are drawn to a polynucleotides [sic]
having at least 80%, 85%, 95% or 99% sequence identity with a particular disclosed sequence. 
The claims do not require that the claimed polypeptide possess any particular biological activity, 
nor any particular conserved structure, or other disclosed distinguishing feature." First Office 
Action at 12. 

Second, the Examiner has asserted that "the skilled artisan cannot envision the detailed
chemical structure of the encompassed genus of polypeptides, and therefore conception is not
achieved until reduction to practice has occurred." Id. at 13.

In response to the claim amendments made by Appellants in the Amendment and 
Response to Office Action, the Examiner maintained the rejection for the reasons set forth in the 
previous Office Action. Final Office Action at 16. In addition, the Examiner repeated the 
argument that "even a very skilled artisan could not envision the detailed chemical structure of 
all or a significant number of encompassed PRO 1926 polypeptides, and therefore, would not 
know how to make or use them." Id. at 17. While the Examiner does discuss Claims 4 and 5,
nowhere does the Examiner address Claims 14-17, or explain how the above arguments apply to 
Claims 6-8 and 11-13 which are not directed to variant polypeptides. 

5. Appellants' Response - Rejected Claims 6 and 12-17 are Adequately Described

The adequacy of written description support is a factual issue and is to be determined on
a case-by-case basis. See e.g., Vas-Cath, Inc. v. Mahurkar, 935 F.2d at 1563, 19 U.S.P.Q.2d at
1116 (Fed. Cir. 1991) (emphasis added). The factual determination in a written description 
analysis depends on the nature of the invention and the amount of knowledge imparted to those 
skilled in the art by the disclosure. Union Oil v. Atlantic Richfield Co., 208 F.3d 989, 996 (Fed. 
Cir. 2000). 



As discussed above, the Examiner has failed to specifically address any of the pending 
claims. However, even beyond addressing the pending claims as a group, the Examiner must 
address each of the pending claims on a case-by-case basis. This is because the genera
encompassed by the claims differ, and therefore whether or not the specification supports the
claimed genus depends on the claim at issue. Appellants hereby request that the Board consider 
the following groupings of Claims 6 and 12-17 with respect to the written description 
requirement. 

a. Rejected Claims 6 and 12-13 are adequately described

Claim 6 and claims dependent therefrom are adequately described by the specification. 
Claim 6 is directed to an isolated polypeptide comprising the amino acid sequence of the 
polypeptide of SEQ ID NO: 136, the amino acid sequence of the polypeptide of SEQ ID NO: 136 
lacking its associated signal peptide, or the amino acid sequence of the polypeptide encoded by 
the full-length coding sequence of the cDNA deposited under ATCC accession number 203547. 
Claims 7-8 and 11-13 ultimately depend from Claim 6. 

As stated above, the Examiner provides no basis for rejecting any of Claims 6 and 12-13 
because the Examiner's rejection is premised on a lack of written description support for claims 
"drawn to a polynucleotides [sic] having at least 80%, 85%, 95% or 99% sequence identity with 
a particular disclosed sequence." First Office Action at page 12. 

Regardless of any reasoning that might have been provided by the Examiner, each recited 
element of Claim 6 is explicitly disclosed in the specification, either in writing (see, e.g.,
Specification at Figure 136) or by virtue of a biological deposit. Accordingly, there can be no 
basis for holding that Claim 6 is not adequately described. Likewise, Claims 12-13 which are 
dependent from Claim 6 and are drawn to particular embodiments of Claim 6, are also fully 
described by the specification. The Examiner does not contest the written description support for 
any embodiment recited in these dependent claims. Therefore the Examiner has failed to meet 
his "initial burden of presenting by a preponderance of evidence why a person skilled in the art 
would not recognize in an applicant's disclosure a description of the invention defined by the 
claims. Wertheim, 541 F.2d at 263, 191 U.S.P.Q. at 97." M.P.E.P. § 2163.04. As such, the 
Board should reverse the Examiner's rejection of Claims 6-8 and 11-13 under 35 U.S.C. § 112, 
first paragraph, for lack of written description. 

b. Rejected Claims 14, 16 and 17 are adequately described

Claims 14, 16 and 17 are adequately described by the specification. Claim 14 is directed 
to an isolated polypeptide having at least 95% amino acid sequence identity to the amino acid 
sequence of the polypeptide SEQ ID NO: 136, the amino acid sequence of the polypeptide of 
SEQ ID NO: 136 lacking its associated signal peptide, or the amino acid sequence of the 
polypeptide encoded by the full-length coding sequence of the cDNA deposited under ATCC 
accession number 203547; wherein said isolated polypeptide or a fragment thereof can be used to 
generate an antibody which can be used to specifically detect the polypeptide of SEQ ID NO: 
136 in esophageal tissue samples. Claims 16 and 17 ultimately depend from Claim 14. 

Appellants maintain that there is no substantial variation within the species which fall 
within the scope of the rejected claims, which require at least 95% amino acid sequence identity 
to SEQ ID NO: 136 and can be used to generate antibodies which specifically detect the 
polypeptide of SEQ ID NO: 136 in esophageal tissue samples. As such, Appellants were in 
possession of the common attributes or features of the claimed subject matter. 

The rejected claims are analogous to the claims discussed in Example 14 of the written 
description training materials available on the PTO's website. In Example 14, the written 
description requirement was found to be satisfied for claims directed to polypeptides with 95% 
homology to a disclosed sequence that also possess a recited catalytic activity, where procedures 
for making variant proteins were routine in the art and the specification provided an assay for 
detecting the recited catalytic activity of the protein. This disclosure satisfies the written 
description requirement even though the applicant had disclosed only a single species and had
not made any variants. The Guidelines state that "[t]he single species disclosed is representative
of the genus because all members have at least 95% structural identity with the reference 
compound and because of the presence of an assay which applicant provided for identifying all 
of the at least 95% identical variants of SEQ ID NO: 3 which are capable of the specified 
catalytic activity." 

Similarly, the pending claims also have very high sequence homology to the disclosed 
sequence and must share an epitope sufficient to generate antibodies which specifically detect 
the polypeptide of SEQ ID NO: 136 in esophageal tissue samples. As in Example 14, at the time 
of the effective filing date of the instant application, it was well known in the art how to make 
polypeptides with at least 95% amino acid sequence identity to the disclosed sequences. See,
e.g., Specification at ¶¶ [0256]-[0271]. In addition, the specification discloses in detail how to
make antibodies which specifically detect a particular PRO polypeptide, and how to use them to
detect the PRO polypeptide in a particular tissue. See, e.g., Specification at [0363]-[0379],
[0407], and [0493]-[0499]. Like a particular catalytic activity, the function of being useful to 
produce an antibody specific to SEQ ID NO: 136 is directly related to the structure of the claimed 
polypeptides. Thus, like Example 14, the genus of polypeptides that have at least 95% amino 
acid sequence identity to the disclosed sequences and possess the described functional activity 
are adequately described. 

Claims 16 and 17, drawn to particular embodiments of Claim 14, are also fully described 
by the specification. The Examiner does not contest the written description support for any 
embodiment recited in these dependent claims. 

The Examiner's arguments in the First Office Action, that the claims "do not require that 
the claimed polypeptide possess any particular biological activity, nor any particular conserved 
structure, or other disclosed distinguishing feature" are moot in light of Claims 14-17, which
were added after the First Office Action and require a conserved structure as detailed above. 
Likewise, the Examiner's arguments in the Final Office Action are directed solely to Claims 4 
and 5, and do not apply to Claims 14-17. 

As for the Examiner's conclusory and unsupported statement that "the skilled artisan
cannot envision the detailed chemical structure of the encompassed genus of polypeptides, and 
therefore conception is not achieved until reduction to practice has occurred," the basic premise 
that a large genus cannot be adequately described by a single species is simply wrong. In a
recent Federal Circuit decision, In re Wallach, 378 F.3d 1330, 1333-34 (Fed. Cir. 2004), the
Court stated: 

[W]e agree with Appellants that the state of the art has developed such that the 
complete amino acid sequence of a protein may put one in possession of the genus 
of DNA sequences encoding it, and that one of ordinary skill in the art at the time
the '129 application was filed may have therefore been in possession of the entire
genus of DNA sequences that can encode the disclosed partial protein sequence, 
even if individual species within that genus might not have been described or 
rendered obvious. ... A claim to the genus of DNA molecules complementary to 
the RNA having the sequences encompassed by that formula, even if defined only 
in terms of the protein sequence that the DNA molecules encode, while 
containing a large number of species, is definite in scope and provides the public 
notice required of patent applicants.
Moreover, we see no reason to require a patent applicant to list every possible
permutation of the nucleic acid sequences that can encode a particular protein for 
which the amino acid sequence is disclosed, given the fact that it is, as explained 
above, a routine matter to convert back and forth between an amino acid sequence 
and the sequences of the nucleic acid molecules that can encode it. Id. (emphasis
added). 

The Court did not require the applicants in Wallach to actually make or individually 
describe all of the vast number of sequences which encode the disclosed sequence. This is in 
spite of the fact that only a single sequence was disclosed, and the encompassed genus was 
enormous due to codon degeneracy in the genetic code - even the most skilled artisan could not 
individually envision the detailed chemical structure of the nucleic acids encompassed by the 
claimed genus. The Court reasoned that because it is routine to convert between amino acid
sequences and nucleic acid sequences, disclosure of a single amino acid sequence was sufficient to
place the applicants in possession of the enormous genus of nucleic acids which could encode 
the sequence. 
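
The size of the genus at issue in Wallach can be illustrated by counting the DNA coding sequences for a short peptide. The sketch below is a rough illustration and is not part of the record: it uses the number of codons assigned to each amino acid in the standard genetic code, and both the function name and the four-residue peptide are hypothetical examples.

```python
# Number of codons per amino acid in the standard genetic code
# (one-letter amino acid codes; stop codons are excluded).
CODONS_PER_RESIDUE = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2,
    "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def count_coding_sequences(peptide: str) -> int:
    """Count the distinct DNA coding sequences that encode the given peptide."""
    total = 1
    for residue in peptide:
        total *= CODONS_PER_RESIDUE[residue]
    return total


# Hypothetical four-residue peptide: Met-Leu-Ser-Arg.
print(count_coding_sequences("MLSR"))  # 216
```

Because the count is a product over residues, it grows roughly exponentially with peptide length, which is consistent with the description of the encompassed genus as enormous.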

The facts in Wallach are very similar to the instant case. Here, Appellants have disclosed 
SEQ ID NO: 136, and claim polypeptides which are at least 95% identical to it and have the 
functional limitation of the ability to generate antibodies which can be used to specifically detect 
SEQ ID NO: 136 in esophageal tissue samples. As discussed above, it is routine in the art to 
create polypeptides which have at least 95% sequence identity to SEQ ID NO: 136 - it is just as 
predictable and easy as creating all of the nucleic acids which encode a particular amino acid 
sequence. Similarly, it is well within the knowledge of those skilled in the art how to determine 
which polypeptides can be used to make the recited antibodies. The predictability of this 
structure/function combination is sufficient to place the claimed subject matter in the possession 
of the Appellants, and thus the claimed polypeptides are adequately described. The Wallach 
opinion makes clear that there is no need to literally describe more than a single species to 
adequately describe a large genus where one of skill in the art recognizes that the disclosed 
species puts the applicant in possession of the claimed genus. 

c. Rejected Claim 15 is adequately described

For the reasons discussed above regarding Claims 14 and 16-17, Appellants believe that
Claim 15 is also adequately described. However, because SEQ ID NO: 136 is 242 amino acids 
long, a polypeptide which is at least 99% identical can only have approximately 2 deletions or 
substitutions. As a result, the genus of polypeptides encompassed by Claim 15 is smaller than 
that of Claim 14, and the Board should consider the adequacy of the written description for this 
claim independently of the other claims. 

6. Written Description - Conclusion 

In conclusion, the Board should reverse the Examiner's written description rejection of 
Claims 6-8 and 11-17 because the Examiner has failed to rebut the presumption that the claims 
are adequately described, as he has failed to even address the claims at issue. And even if the 
Examiner's arguments directed to non-pending claims are extended to the claims at issue, 
Appellants submit that they have satisfied the written description requirement for rejected Claims 
6-8 and 11-17 based on the actual reduction to practice of SEQ ID NO:136, by specifying a high 
level of amino acid sequence identity, and by describing how to make and use antibodies to the 
disclosed sequence. These facts are directly analogous to those of Example 14 of the Written 
Description Guidelines published by the PTO. In addition, like In re Wallach, the description of 
the single species SEQ ID NO: 136 is sufficient to place the Appellants in possession of the
claimed genus because those of skill in the art recognize the correlation between polypeptide 
structure and the ability to generate specific antibodies. Appellants submit that the instant 
disclosure allows one of skill in the art to "recognize that the applicant was in possession of the 
necessary common attributes or features of the elements possessed by the members of the 
genus." Hence, Appellants respectfully request that the Board reverse the Examiner's written 
description rejection of Claims 6-8 and 11-17 under 35 U.S.C. §112, first paragraph. 

E. 35 U.S.C. § 102(b) Rejection 

The Examiner has rejected pending Claims 6-8, and 11-17 under 35 U.S.C. § 102(b) as 
being anticipated by Valenzuela et al. (WO 00/55375), which was published September 21, 2000. 
The Examiner has stated that because the instant invention lacks utility, the filing date of May 7, 
2002 is considered the priority date of the instant application. 

To be anticipated under 35 U.S.C. § 102(b), the invention must be patented or described 
in a printed publication "more than one year prior to the date of the application for patent in the 
United States." 35 U.S.C. § 102(b). Appellants submit that Valenzuela does not anticipate any 
of the pending claims because it was not published more than one year prior to the date of the 
instant application for patent in the United States. The instant application is a continuation of, 
and claims priority under 35 U.S.C. § 120 to, US Application 10/006867 filed 12/6/2001, which 
is a continuation of, and claims priority under 35 U.S.C. § 120 to, PCT Application 
PCT/US00/23328 filed 8/24/2000, which claims priority under 35 U.S.C. § 119 to US 
Provisional Application 60/170262 filed 12/9/1999. 

Appellants submit that for the reasons stated above, the claimed polypeptides have a 
credible, substantial, and specific utility. The sequences of SEQ ID NOs:135 and 136 were first 
disclosed in US Provisional Application 60/170262 filed 12/9/1999 in Figures 1 and 2A-B. The 
data in Example 18 (Tumor Versus Normal Differential Tissue Expression Distribution), relied 
on in part for the utility of the claimed polypeptides, are disclosed in PCT Application 
PCT/US00/23328 filed 8/24/2000, on page 93, line 3, through page 96, line 35. Valenzuela was 
published September 21, 2000. Thus, Valenzuela was not published more than one year prior to 
the filing of either PCT Application PCT/US00/23328 filed August 24, 2000, or US Provisional 
Application 60/170262 filed December 9, 1999. The instant application claims priority to both, 
and therefore Valenzuela cannot be cited as prior art against the instant application under 35 
U.S.C. § 102(b). 
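
The date comparison underlying this argument can be sketched as follows. This is a simplified illustration and not a statement of the law: it assumes only that the statutory bar applies when the publication date falls more than one year before the filing date being tested, and it uses the dates recited above. The function names `one_year_before` and `is_statutory_bar` are hypothetical.

```python
from datetime import date


def one_year_before(d: date) -> date:
    """Return the date one year before d (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year - 1)
    except ValueError:  # d is Feb 29 and the prior year is not a leap year
        return d.replace(year=d.year - 1, day=28)


def is_statutory_bar(publication: date, filing: date) -> bool:
    """True if the publication predates the filing date by more than one year."""
    return publication < one_year_before(filing)


valenzuela = date(2000, 9, 21)  # publication date of WO 00/55375

filing_dates = {
    "filing date relied on by the Examiner": date(2002, 5, 7),
    "PCT/US00/23328": date(2000, 8, 24),
    "US Provisional 60/170262": date(1999, 12, 9),
}

for label, filed in filing_dates.items():
    print(label, is_statutory_bar(valenzuela, filed))
```

On these assumptions the reference qualifies only if the May 7, 2002 date is used. Measured against either earlier priority date, the publication is not more than one year prior, which is the point Appellants make above.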

Hence, Appellants respectfully request that the Board reverse the Examiner's rejection of 
Claims 6-8, and 11-17 under 35 U.S.C. § 102(b) as being anticipated by Valenzuela et al. 

F. Conclusion 

In view of the arguments presented above, Appellants submit that the specification as 
filed provides a specific, substantial and credible utility for the claimed polypeptides, that one of 
skill in the art would be able to make and use the claimed polypeptides without undue 
experimentation, that the specification as filed provides an adequate description of the claimed 
subject matter, and that the claims are not anticipated by the cited reference. Appellants 
therefore request that the Board reverse the Examiner's rejections under 35 U.S.C. §§101, 112, 
and 102. 

Please charge any additional fees, including any fees for additional extension of time, or 
credit overpayment to Deposit Account No. 11-1410. 



Respectfully submitted, 



KNOBBE, MARTENS, OLSON & BEAR, LLP 




Registration No. 
Attorney of Record 
Customer No. 30,313 
(619) 235-8550 



VIII. APPENDIX A - CLAIMS ON APPEAL 

1-5. (Canceled). 

6. (Previously Presented) An isolated polypeptide comprising: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO: 136; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO: 136, lacking its 
associated signal peptide; 

(c) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 203547. 

7. (Previously Presented) The isolated polypeptide of Claim 6 comprising the amino 
acid sequence of the polypeptide of SEQ ID NO: 136. 

8. (Previously Presented) The isolated polypeptide of Claim 6 comprising the amino 
acid sequence of the polypeptide of SEQ ID NO: 136, lacking its associated signal peptide. 

9. (Canceled) 

10. (Canceled) 

11. (Original) The isolated polypeptide of Claim 6 comprising the amino acid 
sequence of the polypeptide encoded by the full-length coding sequence of the cDNA deposited 
under ATCC accession number 203547. 

12. (Currently Amended) A chimeric polypeptide comprising a polypeptide 
according to Claim 6 fused to a heterologous polypeptide. 

13. (Previously Presented) The chimeric polypeptide of Claim 12, wherein said 
heterologous polypeptide is a tag polypeptide or an Fc region of an immunoglobulin. 

14. (Previously Presented) An isolated polypeptide having at least 95% amino acid 
sequence identity to: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO: 136; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO: 136, lacking its 
associated signal peptide; 

(c) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 203547; 

and wherein said isolated polypeptide or a fragment thereof can be used to 
generate an antibody which can be used to specifically detect the polypeptide of SEQ ID 
NO: 136 in esophagus tissue samples. 

15. (Previously Presented) The isolated polypeptide of Claim 14 having at least 99% 
amino acid sequence identity to: 

(a) the amino acid sequence of the polypeptide of SEQ ID NO: 136; 

(b) the amino acid sequence of the polypeptide of SEQ ID NO: 136, lacking its associated signal 
peptide; 

(c) the amino acid sequence of the polypeptide encoded by the full-length coding 
sequence of the cDNA deposited under ATCC accession number 203547; 

and wherein said isolated polypeptide or a fragment thereof can be used to 
generate an antibody which can be used to specifically detect the polypeptide of SEQ ID 
NO: 136 in esophagus tissue samples. 

16. (Previously Presented) A chimeric polypeptide comprising a polypeptide 
according to Claim 14 fused to a heterologous polypeptide. 

17. (Previously Presented) The chimeric polypeptide of Claim 16, wherein said 
heterologous polypeptide is a tag polypeptide or an Fc region of an immunoglobulin. 

IX. APPENDIX B - EVIDENCE 

Attached hereto is a copy of the evidence cited in Appellants' Brief. The list of evidence 
below is accompanied by a statement setting forth where in the record that evidence was entered 
into the record by the Examiner. 






Tab  Reference                       Submitted                       Entered

1    Hu et al. (J. Proteome Res.,                                    Cited by the Examiner in the
     (2003) 2(4):405-12)                                             final Office Action

2    Haynes et al. (Electrophoresis,                                 Cited by the Examiner in the
     (1998) 19(11):1862-71)                                          final Office Action

3    Chen et al. (Mol. and Cell.                                     Cited by the Examiner in the
     Proteomics, (2002) 1:304-313)                                   final Office Action

4    Gygi et al. (Mol. and Cell.                                     Cited by the Examiner in the
     Bio., (1999) 19(3):1720-30)                                     final Office Action

5    First Declaration of            Originally submitted with       Entered by Examiner in final
     J. Christopher Grimaldi         Appellants' Amendment and       Office Action
                                     Response to Office Action as
                                     Exhibit 2

6    Second Declaration by           Originally submitted with       Entered by Examiner in final
     J. Christopher Grimaldi         Appellants' Amendment and       Office Action
                                     Response to Office Action as
                                     Exhibit 5

7    Declaration of Paul Polakis,    Originally submitted with       Entered by Examiner in final
     Ph.D.                           Appellants' Amendment and       Office Action
                                     Response to Office Action as
                                     Exhibit 6

8    Bruce Alberts, et al.           Originally submitted with       Entered by Examiner in final
     Molecular Biology of the Cell   Appellants' Amendment and       Office Action
     (3rd ed. 1994)                  Response to Office Action as
                                     Exhibit 7

9    Bruce Alberts, et al.           Originally submitted with       Entered by Examiner in final
     Molecular Biology of the Cell   Appellants' Amendment and       Office Action
     (4th ed. 2002)                  Response to Office Action as
                                     Exhibit 8

10   Benjamin Lewin, Genes VI (1997) Originally submitted with       Entered by Examiner in final
                                     Appellants' Amendment and       Office Action
                                     Response to Office Action as
                                     Exhibit 9

11   Zhigang et al., World Journal   Originally submitted with       Entered by Examiner in final
     of Surgical Oncology 2:13       Appellants' Amendment and       Office Action
     (2004)                          Response to Office Action as
                                     Exhibit 10

12   Meric et al., Molecular Cancer  Originally submitted with       Entered by Examiner in final
     Therapeutics, vol. 1, 971-979   Appellants' Amendment and       Office Action
     (2002)                          Response to Office Action as
                                     Exhibit 11

13   U.S. Patent No. 6,414,117       Originally submitted with       Entered by Examiner in final
                                     Appellants' Amendment and       Office Action
                                     Response to Office Action as
                                     Exhibit 3

14   U.S. Patent No. 6,124,433       Originally submitted with       Entered by Examiner in final
                                     Appellants' Amendment and       Office Action
                                     Response to Office Action as
                                     Exhibit 4

X. APPENDIX C - RELATED PROCEEDINGS 

There are no decisions rendered by a court or the Board in any related proceedings 
identified above. 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

PATENT 

Applicant : Eaton, et al. 
Appl. No. : 10/063,557 
Filed : May 2, 2002 
For : SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC ACIDS ENCODING THE SAME 
Examiner : David J. Blanchard 
Group Art Unit : 1642 

DECLARATION OF J. CHRISTOPHER GRIMALDI UNDER 37 CFR 

Commissioner for Patents 

P.O. Box 1450 

Alexandria, VA 22313-1450 

Dear Sir: 

I, J. Christopher Grimaldi, declare and state as follows: 

1. I am a Senior Research Associate in the Molecular Biology Department of 
Genentech, Inc., South San Francisco, CA 94080. 

2. My scientific Curriculum Vitae, including my list of publications, is attached to 
and forms part of this Declaration (Exhibit A). 

3. I joined Genentech in January of 1999. From 1999 to 2003, I directed the Cloning 
Laboratory in the Molecular Biology Department. During this time I directed or performed 
numerous molecular biology techniques including semi-quantitative Polymerase Chain Reaction 
(PCR) analyses. I am currently involved, among other projects, in the isolation of genes coding 
for membrane associated proteins which can be used as targets for antibody therapeutics against 
cancer. In connection with the above-identified patent application, I personally performed or 
directed the semi-quantitative PCR gene expression analyses in the assay entitled "Tumor Versus 
Normal Differential Tissue Expression Distribution," which is described in EXAMPLE 18 in the 
specification. These studies were used to identify differences in gene expression between tumor 
tissues and their normal counterparts. 

4. EXAMPLE 18 reports the results of the PCR analyses conducted as part of the 
investigation of several newly discovered DNA sequences. This process included developing 
primers and analyzing expression of the DNA sequences of interest in normal and tumor tissues. 
The analyses were designed to determine whether a difference exists between gene expression in 
normal tissues as compared to tumor in the same tissue type. 

5. The DNA libraries used in the gene expression studies were made from pooled 
samples of normal and of tumor tissues. Data from pooled samples is more likely to be accurate 
than data obtained from a sample from a single individual. That is, the detection of variations in 
gene expression is likely to represent a more generally relevant condition when pooled samples 
from normal tissues are compared with pooled samples from tumors in the same tissue type. 

6. In differential gene expression studies, one looks for genes whose expression levels 
differ significantly under different conditions, for example, in normal versus diseased tissue. 
Thus, I conducted a semi-quantitative analysis of the expression of the DNA sequences of 
interest in normal versus tumor tissues. Expression levels were graded according to a scale of - 
, and +/- to indicate the amount of the specific signal detected. Using the widely accepted 
technique of PCR, it was determined whether the polynucleotides tested were more highly 
expressed, less expressed, or whether expression remained the same in tumor tissue as compared 
to its normal counterpart. Because this technique relies on the visual detection of ethidium 
bromide staining of PCR products on agarose gels, it is reasonable to assume that any detectable 
differences seen between two samples will represent at least a two fold difference in cDNA. 

7. The results of the gene expression studies indicate that the genes of interest can be 
used to differentiate tumor from normal. The precise levels of gene expression are irrelevant; 
what matters is that there is a relative difference in expression between normal tissue and tumor 
tissue. The precise type of tumor is also irrelevant; again, the assay was designed to indicate 
whether a difference exists between normal tissue and tumor tissue of the same type. If a 
difference is detected, this indicates that the gene and its corresponding polypeptide and 
antibodies against the polypeptide are useful for diagnostic purposes, to screen samples to 
differentiate between normal and tumor. Additional studies can then be conducted if further 
information is desired. 

8. I hereby declare that all statements made herein of my own knowledge are true and 
that all statements made on information or belief are believed to be true, and further that these 
statements were made with the knowledge that willful false statements and the like so made are 
punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States 
Code and that such willful statements may jeopardize the validity of the application or any 
patent issued thereon. 

J. Christopher Grimaldi 

1434-36th Ave. 
San Francisco, CA 94122 
(415) 681-1639 (Home) 

EDUCATION University of California, Berkeley 
Bachelor of Arts in Molecular Biology, 1984 

EMPLOYMENT EXPERIENCE 

Genentech Inc., South San Francisco; 1/99 to present 
rw^^r -^1^ jf"®^ ^® ^^^f Antigen (TAP), and Secttted Ttimor Pnirft, "* 

?ovel genes discoveL^u^'^S^TS cST^^^ ^« ^^^"^ 

iniRtemented and patented hi^ ti^^ 

essential fortheisLtion of h^dr^S^elS^A^^ 

weU as dozens of othw smaUe? projects. ^ ^^^'^ ^'""^ ^ 

Scientist DNAX Research Institute, Palo Alto; 9/91 to 1/99 

fcvolved injgnli^iae pioject&aitifed-attmdersbmding novel genes discovered tift»ii»h 

Facilities 
Manager Corixa, Redwood City; 5/89 - 7/91. 

Directed plant-related activities, which included expansion planning, maintenance, safety, 

SRA University of California, San Francisco 
Cancer Research Institute; 2/87-4/89. 

Research 
Technician Berlex Biosciences, South San Francisco; 7/85-2/87. 

ScterSigSSSSesS^K^ 

vectorforJbyoth^Sti^S; ^ Also cansfnicted a genial purpose expr^^oi 

PUBLICATIONS 

' a Unique Acvl-CoA ^^^JZ't^^ • 2?* ^^^'^ A. Lewuj, & Steven CoIman'^FIT. 
or^Sono^^^h^S?^.^^^ Caonig, 
Jc^^^slo l3t?S^S '"""^^ BiochenS' 

by Wnt.l andRl&SL^S— 

^' JS^l^/^Jf'V^^'^^ Bryant, Gordon Vehar 

pilu A Ifi' P»^Pher Grimaldi (incorrecUy named as «Grinial<fi cf'V P™nlii« 
Peale Apama Draksharapu. David A. Lewin, and Mary B. oLiW^^S fo^^ 



5. 



leukocyte Biolo^vl^?^SS?a^r^^ chemobne realtor 3 (CCR3). Journal of 



o 



9m 1324-33; S ^ "^"^ nsiKHKes.- Blood Vol. 

. i5T^9^TlS An„™o»g«laU«yectoenzyme- Lnmuhology Todar^ 
B CeU Acdvation^J^ H^^^?:S?Sf 

151. 31 11-3118. 1993 of Immunology, Vol. 

15. D^^nd J. Rawlings Douglas C Saffiran, Satoshi Tsukada. David A. Lar^aesDada. J 
Omstopher Grimaldi, Lucie Cohen Randolph N. Mohr J Fema^o^r^i^ 



"Small-Scale Lambda DNA Prep." Contribution to 
Current Protocols in Molecular Biology, Supplement 5, Winter 1989 

"The t(5;14) Chromosomal Translocation in a Case of 
Acute Lymphocytic Leukemia Joins the Interleukin-3 Gene to the Immunoglobulin 
Heavy Chain Gene." Blood, Vol 73, 2081-2085, 1989 

"An Additional Breakpoint Region in the 
BCL-1 Locus Associated with the t(11;14)(q13;q32) Translocation of B-Lymphocytic 
Malignancy." Blood, Vol 74, 1801-1806, 1989 

Sli^J^S-^^^^V-^^^Pl^^rC^ "LackofDetec«aM» 

■ ' lSLSJTST^-^^"/ of the Ig H.Chain Gene of a Bmnan 

Lymphocytic Leukemia." The Journal of Immunology, Vol. 141, 3994-3998, 1988 

MANUSCRIPTS PREPARATION 

1. ^«mB^asnbramanian.L Christopher Grimaldi,LFa^ 

M^ue^Howacd. JStHK^uial and fonctional characterization of CD38:StSr^ 
active site lesidues «*»v««w vx 

PATENTS 

1. "Methods for Eosinophil Depletion with Antibody to CCR3 Receptor" (US 6,207,155 B1). 

2. "Amplification Based Cloning Method." (US 6,607,899) 

the Same. patent «>veisseyerall^p^gei]^) , _ <, . 

.V 

4. "IL-17 Homologous Polypeptides and Therapeutic Uses Thereof" 

5. "Method of Diagnosing and Treating Cartilaginous Disorders." 

MEMBERSHIPS AND ACTIVITIES 

Editor Frontiers in Bioscience 

Member DNAX Safety Committee 1991-1999 

Biological Safety Affairs Forum (BSAF) 1990-1991 
Environmental Law Foundation (ELF) 1990-1991 

DECLARATION OF J. CHRISTOPHER GRIMALDI UNDER 37 CFR 



Commissioner for Patents 
P.O. Box 1450 

Alexandria, VA 22313-1450 

Dear Sir: 

I, J. Christopher Grimaldi, declare and say as follows: 

1. I am a Senior Research Associate in the Molecular Biology Department of 
Genentech, Inc., South San Francisco, CA 94080. 

2. I joined Genentech in January of 1999. From 1999 to 2003, I directed the Cloning 
Laboratory in the Molecular Biology Department. During this time I directed or performed 
numerous molecular biology techniques including qualitative Polymerase Chain Reaction (PCR) 
analyses. I am currently involved in, among other projects, the isolation of genes coding for 
membrane associated proteins which can be used as targets for antibody therapeutics against 
cancer. In connection with the above-identified patent application, I personally performed or 
directed the semi-quantitative PCR analyses in the assay entitled "Tumor Versus Normal 
Differential Tissue Expression Distribution," which is described in EXAMPLE 18 in the 
specification, that were used to identify differences in gene expression between tumor tissue and 
their normal counterparts. 

3. My scientific Curriculum Vitae, including my list of publications, is attached to 
and forms part of this Declaration (Exhibit A). 

4. In differential gene expression studies, one looks for genes whose expression levels 
differ significantly under different conditions, for example, in normal versus diseased tissue. 
Chromosomal aberrations, such as gene amplification, and chromosomal translocations are 
important markers of specific types of cancer and lead to the aberrant expression of specific 
genes and their encoded polypeptides, including over-expression and under-expression. For 
example, gene amplification is a process in which specific regions of a chromosome are 
duplicated, thus creating multiple copies of certain genes that normally exist as a single copy. 
Gene under-expression can occur when a gene is not transcribed into mRNA. In addition, 
chromosomal translocations occur when two different chromosomes break and are rejoined to 
each other, resulting in a chimeric chromosome which displays a different expression 
pattern relative to the parent chromosomes. Amplification of certain genes such as Her2/Neu 
[Singleton et al., Pathol. Annu., 27 Pt 1:165-190], or chromosomal translocations such as t(5;14) 
[Grimaldi et al., Blood, 73(8):2081-2085 (1989); Meeker et al., Blood, 76(2):285-289 (1990)] give 
cancer cells a growth or survival advantage relative to normal cells, and might also provide a 
mechanism of tumor cell resistance to chemotherapy or radiotherapy. When the chromosomal 
aberration results in the aberrant expression of a mRNA and the corresponding gene product (the 
polypeptide), as it does in the aforementioned cases, the gene product is a promising target for 
cancer therapy, for example, by the therapeutic antibody approach. 

5. Comparison of gene expression levels in normal versus diseased tissue has 
important implications both diagnostically and therapeutically. For example, those who work in 
this field are well aware that in the vast majority of cases, when a gene is over-expressed, as 
evidenced by an increased production of mRNA, the gene product or polypeptide will also be 
over-expressed. It is unlikely that one identifies increased mRNA expression without associated 
increased protein expression. This same principle applies to gene under-expression. When a 
gene is under-expressed, the gene product is also likely to be under-expressed. Stated in another 
way, two cell samples which have differing mRNA concentrations for a specific gene are 
expected to have correspondingly different concentrations of protein for that gene. Techniques 
used to detect mRNA, such as Northern Blotting, Differential Display, in situ hybridization, 
quantitative PCR, Taqman, and more recently Microarray technology all rely on the dogma that a 
change in mRNA will represent a similar change in protein. If this dogma did not hold true then 
these techniques would have little value and not be so widely used. The use of mRNA 
quantitation techniques has identified a seemingly endless number of genes which are 
differentially expressed in various tissues and these genes have subsequently been shown to have 
correspondingly similar changes in their protein levels. Thus, the detection of increased mRNA 
expression is expected to result in increased polypeptide expression, and the detection of 
decreased mRNA expression is expected to result in decreased polypeptide expression. The 
detection of increased or decreased polypeptide expression can be used for cancer diagnosis and 
treatment. 

6. However, even in the rare case where the protein expression does not correlate 
with the mRNA expression, this still provides significant information useful for cancer diagnosis 
and treatment. For example, if over- or under-expression of a gene product does not correlate 
with over- or under-expression of mRNA in certain tumor types but does so in others, then 
identification of both gene expression and protein expression enables more accurate tumor 
classification and hence better determination of suitable therapy. In addition, absence of over- or 
under-expression of the gene product in the presence of a particular over- or under-expression of 
mRNA is crucial information for the practicing clinician. For example, if a gene is over-expressed 
but the corresponding gene product is not significantly over-expressed, the clinician accordingly 
will decide not to treat a patient with agents that target that gene product. 

7. I hereby declare that all statements made herein of my own knowledge are true and 
that all statements made on information or belief are believed to be true, and further that these 
statements were made with the knowledge that willful false statements and the like so made are 
punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States 
Code and that such willful statements may jeopardize the validity of the application or any 
patent issued thereon. 



By: 
J. Christopher Grimaldi 

Date: 

J. Christopher Grimaldi 

1434-36th Ave. 
San Francisco, CA 94122 
(415) 681-1639 (Home) 

EDUCATION University of California, Berkeley 
Bachelor of Arts in Molecular Biology, 1984 



EMPLOYMENT EXPERIENCE 

Genentech Inc., South San Francisco; 1/99 to present 

fte^ously. was.responsible direct and manage (he Cloning Lab. Conently focused obl 

l^unor Antigen CrAF)'and Secreted tLSoS 
CSrOP) proj ecte for the Oncology Department as weU as linmunologicaBy relevant S^for th^ 

P'*^*^*"' P°^°^ therapeutic use (SPD?). FoX STrSct 
my duties vrere, among other things, the criticaUy importaTcoordination of the clo^of 
ftousands of putative genes, by developing a smooth process of commuiSS^S^Le 
Biomformatics. Qoning, Sequencing, and Legal tean^. CoUaborated vriSb^Trdt^rs to 
^« novel genes through the Curagen project, a uxuque differential S^^ky mSoLS, 
Interacted extensively wiflr dre Legal team providing esiatial dat9 needed fw S^^^. 
iiovel genes discovered through the SPDL TAP and Curagen Fq^te^^uSSlS. 
mq.lemented and patented high throughput ctontogmSoIo^^ 

Scientist DNAX Research Institute, Palo Alto; 9/91 to 1/99 

biomfonnatics studies and functional assays. Developed and patented a method fortfie <^ifu^ 
depletion of eosiiiophils ur vivo usmgmonoclonalaS^ 

essential technical a«thodologies and provided strategic direction in the a^ ^eSS 

Sed'Stl'*"^'*'^"' generalmolecularbiology. and monoclonal antibody pSSn. 
itained and supervised numerous technical.staff. / i'l^uu^uun. 

Facilities 

Manager Corixa, Redwood City; 5/89 - 7/91. 

Directed plant-related activities, which included expansion planning, maintenance, safety, 
p^chasmg, mv^tory control, shipping and r^iviSg. and laborato^Z^S. SSked 
and implanented the safety program. Also served as Uaison to regubtory ienS aTSS 
Jtote and federd level. Was in charge of property leases. leasehoEpSvSS^eto 

personnel to carry out the above-mentioned duties. "uuwpwvisea 



Q 



SRA University of California, San Francisco 
Cancer Research Institute; 2/87-4/89. 

Sff^fS T^^"^^ <^»o°«^g projects includmg: studies of somatic hypermutatioii 
Stages of AIDS-associated lymphomas, and cloning of K5;14), t(ll;14), andS 

.sirrin^si:pS:^^ 

Research 
Technician Berlex Biosciences, South San Francisco; 7/85-2/87. 

'^^^nJ^^"'^^^^''^^''^^'^' ^'^g wiihdegeneioligonucleotidrand 
^ShroSS^^J^."'-"^ Alsocbnstmctedagene^seex^on 

PUBLICATIONS 

*'^«SecretedProtemIMscoveiyImtiative(SPDD.aI^scale 
^r;*°^^°^£^°^^^H'««^SecietedandTiansmembia^ 
bioinformatics assessment." Genome Res. Vol. 13(10), 2265-2270, 2003 

^ SJnfil^' ^'^'^'^^ ^8 Xian Yu, Audrey D. doddard. J. 

1 ^ I^'"*. I>avid A. Lewin. & Steven Co W "BHT 

• SS. h ^'"""^ ^ Thermogenic Brown Adiopose Tissue: Sg. 

3. SzetoW.riMgW.Tice DA. RubinfeldB,Hol]ingsheadPG, Pong SE, Dagger DL PhamT 
^Zl'IZ''^?^'^' ^^^P^^^' Singh JS.Fran.lGDKE^^ STi^ 
SfSA^?^'*^^'^^^^'^*''^^^*^^'^^^ «OveifcxBi4sionof 
fte Retenoic Acid-Responsive Goie Stra6 in Human Cant^ and its Synergistic^v2irai 
by Wnt-1 and Retindc Add." Cancer Research Vol. 61(10), 4197-4^^1 

JSt^h'^ii^JS:^^^^^' ^^^^^ ^ele. Xiaohua Xin. JuUet E. Bryant, Gordon Vehar 
Jm Schoenfeld^r. Chnstopher Grimaldi (incorrecdy named as "Grimal(tt, 6") F^diT 
Peale. AparnaDr^haiapu,DavidAUwin,andMaiyB.Gerrits^^^ "Gene^kpS^ 
SSSS Tm ^"^^ of ABgiogenesis." American Journal of FaAolo^'w T^^(6). 

WG, Wu X, Soto H, O'Garra A, Howard MC, Coffman RL. "Depletion of eosinophils in 
mice through the use of antibodies specific for C-C chemokine receptor 3 (CCR3)." Journal of 
Leukocyte Biology, Vol. 65(6), 846-53, 1999 

^' S^S^ff '"^r^^^^^^y"^- •1°<'-P«°dendyligatingCD38andPc 
ga^uaRUB. relays a dommant negative signal to BceUs." Hybridoma Vol. 18(2), 113-9. 



o 



o 



9. 



10 



SSl^ cl: ^'^^ ^^^1 T. Griinaldi JC, MuUer-Stef&er H. RandaU TD, Lund FE 
MuiiayR.ScliuberF,HowaidMC. 'Mcedeficiemfortheect<>-nkotinamii^nbk 
5u Tst stS'r'^ CD38 exhibit al..^ inunr^^^-fSr Vol. 

Frances ELund^NanetteW^ 

Gtunaldi, Ttoy D. RandaU, R. M. E. Parichonse. Christopher C G^w^M^n 
^seus. *Mm>peanJoiinrahof ifanHinDto *^ 

"A new appioach to the study of haematopoietic devel^n^ ^ yolk 8ac 
and embryoid body." Development, Vol. 121(10), 3335-3346, 1995 

J. Christopher Grimaldi. Sriram Balasubramanian. J. Fernando Bazan. Amien ShanafelL 

• ^Pp^siv^ess of xidB cells impUcates Bmton's ^osinSTiSr^t'ieSrof 
CD38mduced signal transduction," International Inmimiology. Vol 7(2). 16M7^^^ 

^^"-^^"^.t'T^ ""^T" Maureen 
The t(5;14) Chromosomal Translocation in a Case of Acute Lymphocytic 
Leukemia Joins the Interleukin-3 Gene to the Immunoglobulin Heavy Chain Gene 

By J. Christopher Grimaldi and Timothy C. Meeker 



Chromosomal translocations have proven to be important 
markers of the genetic abnormalities central to the pathogenesis 
of cancer. By cloning chromosomal breakpoints 
one can identify activated proto-oncogenes. We have studied 
a case of B-lineage acute lymphocytic leukemia (ALL) 
that was associated with peripheral blood eosinophilia. The 
chromosomal translocation t(5;14)(q31;q32) from this 
sample was cloned and studied at the molecular level. This 
translocation joined the immunoglobulin heavy chain joining 
(Jh) region to the promoter region of the interleukin-3 
(IL-3) gene in opposite transcriptional orientations. The 
data suggest that activation of the IL-3 gene by the 
enhancer of the immunoglobulin heavy chain gene may play 
a central role in the pathogenesis of this leukemia and the 
associated eosinophilia. 

KARYOTYPIC STUDIES of leukemia and lymphoma 
have identified frequent nonrandom chromosomal 
translocations. Some of these translocations juxtapose the 
immunoglobulin heavy chain (IgH) gene with important 
proto-oncogenes, such as c-myc and bcl-2. In this way, the 
IgH gene can activate proto-oncogenes, resulting in disordered 
gene expression and a step in the development of 
cancer. The investigation of additional nonrandom translocations 
into the IgH locus allows us to identify new genes 
promoting the generation of leukemia and lymphoma. 

A distinct subtype of acute lymphocytic leukemia (ALL) 
has been characterized by B-lineage phenotype, associated 
eosinophilia in the peripheral blood, and a t(5;14)(q31;q32) 
chromosomal translocation. This syndrome probably 
occurs in <1% of all patients with ALL. We hypothesized 
that the cloning of the translocation characteristic of this 
leukemia might allow the identification of an important gene 
on chromosome 5 that plays a role in the evolution of this 
disease. In this report we demonstrate that the interleukin-3 
gene (IL-3) and the IgH gene are joined by this translocation. 

Fig 1. DNA blots of the leukemia sample. The restriction 
fragment pattern of normal human DNA (N) and the leukemia 
sample (L) were compared using a human Jh probe. Rearranged 
bands are indicated by arrows. Sample L exhibits a single 
rearranged band with both HindIII/EcoRI and Sau3A restriction 
digests. The rearranged bands are less intense than the other 
bands because the majority of cells in the sample represent normal 
bone marrow elements. 

MATERIALS AND METHODS 

Sample and DNA blots. A bone marrow aspirate from a representative 
patient with ALL (L1 morphology by French-American-British 
[FAB] criteria), peripheral eosinophilia (up to 20,000 per 
microliter with a normal value of <350 per microliter) and a 
t(5;14)(q31;q32) translocation was studied. Using published methods, 
genomic DNA was isolated and DNA blots were made. Briefly, 
10 µg of high molecular weight (mol wt) DNA were digested using 
an appropriate restriction enzyme and electrophoresed on a 0.8% 
agarose gel. The gel was stained with ethidium bromide, photographed, 
denatured, neutralized, and transferred to Hybond (Amersham, 
Arlington Heights, IL). After treatment of the filter with 
ultraviolet light, hybridization was performed. The filter was washed 
to a final stringency of 0.2% saturated sodium citrate (SSC) and 
0.1% sodium lauryl sulfate (SDS) and exposed to film. The human 
Jh probe has been previously reported. 
Genomic library. The genomic library was made using pub- 
Fragments from 9 to 23 kilobases (kb) [illegible] 

M13 vectors and sequenced by the chain termination method using 
[illegible]; data were derived from both strands. 

We studied a bone marrow sample from a patient with 
ALL and associated peripheral eosinophilia. Karyotypic 
analysis showed the characteristic t(5;14)(q31;q32) 
translocation. These features define a distinctive subtype of ALL. 
The leukemic cells analyzed for cell surface phenotype 
were positive for [illegible] (CD19), CALLA (CD10), HLA-DR, and 
terminal deoxynucleotidyl transferase (Tdt), but negative for 
surface immunoglobulin. This phenotypic profile describes an 
immature cell from the B-lymphocytic lineage. 

The leukemia DNA was analyzed by Southern blots for 
rearrangements of the IgH gene. Using a human immunoglobulin 
Jh probe, a single rearranged band was detected by 
Bcoia, Hbtdm, SsO, SmaA, and «coRI phis Hbtdai 
restriction digests, suggesting rearrangement of one allele 
(Fig 1). The immunoglobulin Jh region from the other allele 
was presumably either deleted or in the germline configuration. 

We hypothesized that the t(5;14)(q31;q32) juxtaposed a 
growth-promoting gene on chromosome 5 with the immunoglobulin 
Jh region on chromosome 14. Therefore, [illegible] 
library was made from the leukemic sample and [illegible] 
with a Jh probe. Fifteen distinct positive clones were [illegible] 
and screened for the presence of the rearranged [illegible] 
fragment that was detected by DNA blot [illegible]. 
Five clones appeared to represent the rearranged 
allele identified by DNA blots. One of these clones (clone no. 
4) was chosen for further study and a detailed restriction 
[illegible] 
fragments from clone no. 4 that hybridized to the human Jh 
[illegible] 
from the leukemia sample, confirming that clone no. 4 
represented the rearranged leukemic allele. 
Clone no. 4 contained 3.7 kb of unknown origin 
joined to the IgH gene in the region of Jh4 (Fig 2). The IgH 
gene from Jh4 to the Cmu region appeared to be in germline 
configuration. Previously, the gene encoding hematopoietic 
growth factor IL-3 had been mapped to chromosome 5 [illegible]. 
[illegible] no. 4 might contain part of [illegible] 
gene; when the restriction maps of human IL-3 and clone 
no. 4 were compared, they were identical for more than 3 kb. 

We confirmed the juxtaposition of the IL-3 gene and the 
[illegible] sequencing of the subcloned 
BstEII/HpaI fragment (Fig 2). The sequence of this fragment 
showed no disruption of the protein coding region or the 
mRNA of the IL-3 gene. The break in the IL-3 gene 
occurred in the promoter region, 452 base pairs (bp) 
upstream of the transcriptional start site (position [illegible]). 

The break in the IgH gene occurred 2 bp upstream of 
the Jh4 region. Between the two breaks, 25 bp of uncertain 
origin (putative N sequence) were inserted. No sequences 
homologous to the immunoglobulin heptamer and nonamer 
could be identified in the IL-3 sequence (Fig 3B). Therefore, 
nucleic acid sequencing confirmed the juxtaposition of the 
IL-3 gene and the IgH gene. The sequence data clearly 
showed that the genes were positioned in opposite 
transcriptional orientations (head-to-head). 

Available data also allowed us to determine the normal 
positions of the IL-3 gene and the GM-CSF gene in relation 
to the centromere of chromosome 5 (Fig 4). The IgH gene is 
known to be positioned with the variable regions toward the 
telomere on chromosome 14q. It has also been shown that 
GM-CSF maps within 9 kb of IL-3 in the same transcriptional 
orientation. Using this information and assuming a 
simple translocation event in our sample, we can conclude 
that the IL-3 gene is normally more centromeric, and the 
GM-CSF gene more telomeric, on chromosome 5q (Fig 4). 
Furthermore, both are transcribed with their 5' ends toward 
the centromere. 

DISCUSSION 

In this report we have cloned a unique chromosomal 
translocation that [illegible] 
distinct clinical form of acute leukemia. This translocation 
joined the promotor of the IL-3 gene to the IgH gene. 
Except for the altered promotor, the IL-3 gene appeared 

• + • ' 

A 5 • G<3TGACCAGGCTTCCCTGiXax:AGTAGTCAAAGTAGTAi^^ 
3 'CCACTGGTCCCAy;a3ACC€GGGTCATCA(^^ ^« 

5<TACCA6ftCAMCBCTCATbSGTT^ 
3*ATCGXCTC!raTGAGAGTAGACAAG(3XCA^ 

, ********* ^ 

5" GTAGTCCAGGTGATGGCAGA1X»GATC<X:ACIG<X3CA6GAGGCC^ ... 
3*CATCAGGTCCACTACCGTCTACTCTAGGCTOACX^^ 

S'GGGGTCCTCTCACCTGCTGCCATCCTTCCCATC^ 

3*CCCCAGGAGAGT6GACGA06GXAC6AAGGGTA6AG2^GTAGGAGGAACT^^ 

• • ... *J^******* 

5»TaTCTTCTTTCACTSAax:!nWAGTACTAGAAAGTCA^ 
3*AAAGAACAAAGTGACTAGAACTCATCATCTTTCA6TACCTACT£ATTAA 

5 •CA6ATAAA6ATCC™CGAC6CCTGCCKX:aCACCACCACCTCC^^ . - _ 

3 • GTCTATTTCTAGGAAGGCTGCGGACGGGGTGTGGTGGTGGAGGGGGGCGGAACGGGCCCCAACACC^^ * 

5 * CACASAS^GGCGGGAGGITG^l^TGCCAACVC^CaC&GAGCC ... 
3*GTGVATATT0CGC€CTCCAACAACGGraGAGAAGTCTCGGGGTGCl^ dOb 

5 ' CCAAAqjSgAGCCGCCTGCCCGTCCTGCTCC^ . ^ _ 

3*GGX7TGTACTCGGCGGACGGGCAGGACGAGGACGAGGTTGAGGACCAGGCGGGGCC^ «4L 

5*AACGTCCTTGAAGACAAGCT6GGmAC 3" ..^ 
31TTGCAGGAACTTCTGITCGACCCAATTG 5' 

BTaJhA S'TGGCCCCAGTAGTCAAAGTAGTCACATTGTGGGAGGCCCCATO'AAGGGGTGCACAAAAACCTGACTCTC 
3 • ACCGGGGTCATCAGTOTCATCAGTGO^AACA CCCTCCGGGGrAATTCCCCAC GTGTTTTO 

▼ TTTTTf TTTTTTT'TTTTTTTT 

5 ' TGGCCCCAGXAGTCAAAGTAGTAGAGGTAATTCJ^TC ATAGCTGCGGATTAGC AGCGTGACCGG^^ 
3 ' ACCGGGGTCATCAGTTTCATC ATCTCCAOTAAGTAgTATCGAC^^ 



CI, #4 



5 * GGC&CCAAGAGATQTGCTTCTCAGAGCerGAGGCTGAACGTGGATGTTT 

3 * CCGTGGTTCTCTACACGAAGAGTCTCGGACTCCGACTTGCACCTACAAATCGTCG^^ 



Fig 3. Sequence of t(5;14)(q31;q32) breakpoint region. (A) Nucleotide sequence of the BstEII/HpaI fragment indicated on Fig 2. 
Nucleotides 1 to 38 represent the Jh4 coding region underlined on the coding strand. Nucleotides 39 to 63 are a putative N region. The 
sequence from position 64 to 668 is that of the germline IL-3 gene. The IL-3 TATA box (495), transcription start (516), and initiation 
methionine (567) are underlined. Two proposed regulatory sequences in the promotor are marked by asterisks (positions 182 and 389). (B) 
Comparative sequence of the t(5;14)(q31;q32) breakpoint region. The IgJh4 region is shown with its coding region, heptamer, and 
nonamer underlined. Clone no. 4 is shown with putative N region sequences underlined. The IL-3 sequence is also shown. A plus sign (+) 
denotes the identical nucleotide between sequences. No heptamer or nonamer is identified in the IL-3 sequence. 
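
The coordinates given in this caption can be cross-checked against the distances quoted in the Results (a 25 bp putative N region, and a break 452 bp upstream of the transcription start). The following is a minimal sketch of that arithmetic, not part of the original report. The variable names are hypothetical, and it assumes that the Jh4 coding region ends at position 38 and that the transcription start is at position 516, which are readings of partly illegible values in the scan.

```python
# Positions are 1-based and inclusive, following the Fig 3 caption.
# JH4_CODING end (38) and TRANSCRIPTION_START (516) are assumed readings
# of values that are partly illegible in the scanned caption.
JH4_CODING = (1, 38)
N_REGION = (39, 63)
IL3_GERMLINE = (64, 668)
TRANSCRIPTION_START = 516
INITIATION_MET = 567


def span_length(span):
    """Length in bp of an inclusive (start, end) interval."""
    start, end = span
    return end - start + 1


# Length of the putative N region inserted between the two breaks.
n_region_length = span_length(N_REGION)

# Distance from the first germline IL-3 nucleotide to the transcription start.
break_to_start = TRANSCRIPTION_START - IL3_GERMLINE[0]

print(n_region_length)  # 25
print(break_to_start)   # 452
```

Both values agree with the text, which is the main reason for reading the transcription start as 516: it also places the start between the TATA box and the initiation methionine.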
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«M»9nwo»tl»«r«»loc8lIo^Th«iioniiJdi;^^ 
«»««• nwito to th. beaiMo^ead ortentattm of these gene*. 

intact as no deletions, insertions, or point mutations were 
detected by restriction mapping of the entire gene and 

[illegible] 

truncated at the Jh4 region, which places the immunoglobulin 
enhancer within [illegible] kb of the IL-3 gene. This leads to 
the hypothesis that the enhancer is increasing transcription 
[illegible]. The same mechanism is 

important for activation of the c-myc gene in some cases of 
[illegible] lymphoma. An alternate hypothesis is that the 
a«!^?^''f''**°"*^**"'5 8''»''»''«8"tsthatan 

[illegible] pathogenesis of this 
leukemia. Overexpression of the IL-3 gene coupled with 
the presence of the IL-3 receptor in these cells could account 
for a strong stimulus for proliferation. In this regard, there 
are data indicating that immature B-lineage lymphocytes 
and B-lineage leukemias may express the IL-3 receptor. 
An additional feature of this type of leukemia is the 
dramatic eosinophilia, consisting of mature forms. It has 
been hypothesized that the eosinophils do not arise from the 
malignant clone, but are stimulated by the tumor [illegible] 

^iiseofthefcrowBeflFertQfn,3one«isiaophiJdiff«r«,;«a. 
*H>n«retionxithiglrteveto^^ VsmmcS^^^ 

have a role b the eoalnophiUa in this typeof IeolKinia.«^ 
[illegible] recombination mechanism [illegible] 
the IgH gene during normal differentiation [illegible] 
in this translocation [illegible] 
breakpoint location at the 5' end of Jh4 and the presence of 
putative N-region sequences. On the other hand, no recombination 
signal sequence (heptamer and nonamer) was found 
in this region on chromosome 5, suggesting that additional 
factors also played a role. Further [illegible] 
mechanism of this and other translocations [illegible] 

[illegible] immunoglobulin enhancer also activates the GM-CSF gene, 
since this gene is positioned only 14 kb away ([illegible]). 
[illegible] interleukin-5 (IL-5) gene maps to chromosome 5q31 
[illegible] synergistically with IL-3 in the stimulation of eosinophil 
proliferation and differentiation. [illegible] 
will be answered by the study of more patient samples. We 
plan to determine whether the t(5;14)(q31;q32) translocation 
is capable of activating multiple lymphokines simultaneously 
[illegible] cooperate in the generation of this [illegible] 
Activation of the Interleukin-3 Gene by Chromosome Translocation in Acute 
Lymphocytic Leukemia With Eosinophilia 

By Timothy C. Meeker, Dan Hardy, Cheryl Willman, Thomas Hogan, and John Abrams 



The t(5;14)(q31;q32) translocation from B-lineage acute 
lymphocytic leukemia with eosinophilia has been cloned 
from two leukemia samples. In both cases, this translocation 
joined the IgH gene and the interleukin-3 (IL-3) gene. In 
one patient, excess IL-3 mRNA was produced by the 
leukemic cells. In the second patient, serum IL-3 levels 
were measured and shown to correlate with disease 

A NUMBER OF chromosome translocations have been 
associated with human leukemia and lymphoma. In 
many cases the study of these translocations has led to the 
discovery or characterization of proto-oncogenes, such as 
[illegible] and c-myc, that are located adjacent to the 
translocation. It is now widely understood that cancer- 
associated translocations disrupt nearby [illegible] 

A distinct subtype of acute leukemia is characterized by 
the triad of B-lineage immunophenotype, eosinophilia, and 
the t(5;14)(q31;q32) translocation. Leukemic cells from 
such patients have been positive for terminal deoxynucleotidyl 
transferase (Tdt), common acute lymphoblastic leukemia 
antigen (CALLA), and CD19, but negative for surface or 
cytoplasmic immunoglobulin. In previous work, we cloned 
the t(5;14) breakpoint from one leukemic sample (Case 1) 
and determined that the IgH and interleukin-3 (IL-3) genes 
were joined by this abnormality. In this report, we extend 
those findings by showing that the t(5;14)(q31;q32) 
translocation from a second leukemia sample (Case 2) has a similar 
structure, and we report our study of growth factor expression 
in these patients. 

MATERIALS AND METHODS 
Samples and Southern blots. Case 1 has been described. 
Clinical features of Case 2 have been described in detail. DNA 
isolation and Southern blotting was done using previously described 
methods. Filters were hybridized with an immunoglobulin Jh probe, 
a 280 bp BamHI/EcoRI genomic IL-3 fragment, and an IL-3 
cDNA probe. 

Northern blots. RNA isolation and Northern blotting have been 
described. Briefly, Northern blots were done by running 9 µg 
total RNA on 1% agarose-formaldehyde gels. Equal RNA loading in 
each lane was confirmed by ethidium bromide staining. Blots were 
hybridized with an IL-3 cDNA probe extending to the [illegible] I site in 
exon 5, a 720 bp Sst I/Kpn I probe derived from intron 2 of the IL-3 
gene, a 600 bp Nhe I/Hpa I IL-5 cDNA probe, and a 500 bp Pst 
I/[illegible] I granulocyte-macrophage colony stimulating factor 
(GM-CSF) cDNA probe. 

Polymerase chain reaction. Primers were designed with BamHI 
sites for cloning. One primer hybridized to the Jh sequences from the 
IgH gene (Primer 144: 5'-TAGGATCCGACGGTGACCAGGGT) 
and the other hybridized to the region [illegible] the TATA box in the IL-3 
gene (Primer 161: S^AACACKIATCCCGCCITATATOIOCAO). 
Polymerase chain reaction (PCR) (95°C for 1 minute, 61°C for 30 
seconds, and 72°C for 3 minutes) was done using 500 ng genomic 
DNA and 50 pmol of each primer in 100 µL containing 67 mmol/L 
Tris-HCl pH 8.8, 6.7 mmol/L MgCl2, 10% dimethyl sulfoxide 
(DMSO), 170 µg/mL bovine serum albumin (BSA) (fraction V), 



activity. There was no evidence of excess granulocyte/ 
macrophage colony stimulating factor (GM-CSF) or IL-5 
expression. Our data support the formulation that this 
subtype of leukemia may arise in part because of a 
chromosome translocation that activates the IL-3 gene, 
resulting in [illegible] 

© 1990 by The American Society of Hematology. 

16.6 mmol/L ammonium sulfate, 1[illegible] mmol/L each dNTP and Taq 
polymerase (Perkin-Elmer, Norwalk, CT). 
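
The reaction amounts quoted above can be converted to final concentrations with simple unit arithmetic. The sketch below is illustrative only and is not part of the original methods. The function names are hypothetical, and it assumes the reaction volume is 100 µL, since the unit is not legible in the scan.

```python
def primer_concentration_umol_per_l(amount_pmol, volume_ul):
    """Final primer concentration in µmol/L.

    1 pmol per µL is 1e-12 mol per 1e-6 L, which equals 1 µmol/L,
    so the ratio needs no further conversion factor.
    """
    return amount_pmol / volume_ul


def dna_concentration_ng_per_ul(amount_ng, volume_ul):
    """Final template DNA concentration in ng/µL."""
    return amount_ng / volume_ul


# Values quoted in the methods; the 100 µL volume is an assumed reading.
REACTION_VOLUME_UL = 100.0
primer_conc = primer_concentration_umol_per_l(50.0, REACTION_VOLUME_UL)
dna_conc = dna_concentration_ng_per_ul(500.0, REACTION_VOLUME_UL)

print(primer_conc)  # 0.5
print(dna_conc)     # 5.0
```

Under that assumption each primer is present at 0.5 µmol/L and the genomic template at 5 ng/µL.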

Sequencing. Sequencing was done by chain termination in M13 
vectors. As part of this study, we sequenced [illegible] 
-1240 (with respect to the proposed site of transcription initiation) 
to an Nhe I site at position -642. The plasmid containing [illegible] 
was a gift from [illegible] of the DNAX Research Institute. 

Expression in Cos7 cells. A genomic IL-3 fragment from Case 1 
was cloned into the pXM expression vector. Briefly, the HindIII/ 
[illegible] fragment containing the IL-3 gene was subcloned from the 
previously described phage clone 4 into pUC18. The 16 kb 
fragment extending from the Sma I site 61 bp upstream of the IL-3 
transcription start to the Sma I site in the [illegible] 
the blunted Xho I site of pXM. The negative control [illegible] 
the pXM vector without insert. Plasmids were introduced into Cos7 
cells by electroporation, and supernatant was collected after 48 
hours in culture. 

TF-1 bioassay. TF-1 cells were passaged in RPMI 1640 
supplemented with 10% heat-inactivated fetal bovine serum, 2 mmol/L 
L-glutamine, and 1 ng/mL human GM-CSF. Samples and antibodies 
were diluted in this same medium [illegible] GM-CSF but containing 
penicillin and streptomycin. A 25 µL volume of serial dilutions of 
patient serum was added to wells in a flat bottom 96-well microtiter 
plate. Rat anti-cytokine monoclonal antibody in a volume of 25 µL 
was added to appropriate wells and preincubated [illegible] 
giving a final [illegible] cells per well [illegible] 
volume of 100 µL. The plate was incubated for 48 hours. The 
remaining cell viability was determined metabolically by the 
colorimetric [illegible] of Mosmann using a VMAX microtiter plate 
reader (Molecular Devices, Menlo Park, CA) set at [illegible]. 
Cytokine immunoassays. [illegible] assay used rat monoclonal 
anti-cytokine antibodies (10 µg/mL) to coat the wells of a PVC 
[illegible]. The capture antibodies used were BVD3-6G8, 
JES1-39D10, and BVD2-23B6, for the IL-3, IL-5, and GM-CSF 
assays, respectively. Patient sera were then added (undiluted and 
[illegible] undiluted and diluted 
for GM-CSF). The detecting immunoreagents used were either 
[illegible] 
JES1-5A2 and BVD2-21C11, specific for 
IL-5 and GM-CSF, respectively. Bound antibody was subsequently 
detected with immunoperoxidase [illegible] 
for IL-5 and GM-CSF. The chromogenic substrate 
was [illegible] azino-bis-benzthiazoline sulfonate (ABTS; Sigma, St 
Louis, MO). Unknown values were interpolated from standard 
curves prepared from dilutions of the recombinant factors using 
[illegible] the VMAX microplate reader. 
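
Interpolating an unknown from a standard curve, as described at the end of the immunoassay methods, can be sketched as follows. This is a minimal illustration under stated assumptions and is not the authors' software. The function name and the standard values are hypothetical, and simple piecewise-linear interpolation is assumed because the report does not say which curve fit was used.

```python
from bisect import bisect_left


def interpolate_concentration(absorbance, standards):
    """Estimate a concentration from a standard curve.

    standards is a list of (absorbance, concentration_pg_per_ml) pairs
    measured on dilutions of recombinant factor. Returns None when the
    reading falls outside the range covered by the standards.
    """
    points = sorted(standards)
    readings = [a for a, _ in points]
    if absorbance < readings[0] or absorbance > readings[-1]:
        return None  # below or above the curve; report as out of range
    i = bisect_left(readings, absorbance)
    if readings[i] == absorbance:
        return points[i][1]
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    # Linear interpolation between the two bracketing standards.
    return y0 + (y1 - y0) * (absorbance - x0) / (x1 - x0)


# Hypothetical standard curve, not data from the report.
standards = [(0.1, 100.0), (0.5, 1000.0), (1.0, 5000.0), (1.5, 10000.0)]
estimate = interpolate_concentration(0.75, standards)
print(estimate)  # 3000.0
```

Readings below the lowest standard return None here, which corresponds to the "less than" entries in a results table rather than a numeric estimate.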

RESULTS 

Leukemic DNA from Case 2 was studied by Southern 
blotting. When digested with the HindIII restriction enzyme 
and hybridized with a human immunoglobulin heavy chain 
joining region (Jh) probe, a rearranged fragment at 
approximately 14 kb was detected (data not shown). When reprobed 
with either of two different IL-3 probes, a rearranged 14 kb 
fragment, comigrating with the rearranged [illegible] 

Jto^Rl,areatt«^ed.Jhfh.gmeat,SSSd 
Ttto ^b» also MeatiliM a oomigraSffi^ 

X^stTrS^STa^tS' 

Tocharactciiw better tho joining of 
~^f»i«vy chai 

l^J^ to done the tiandocatfo^ 

tton^hlie««.»lDNAgaveaopi^^^ 
3«Jted a PCR^erivcd Ixagnient of aJJSSTJsd^ 
wWA was doned and sequenced. *R 

c^Sl^lJ:'"^"^^ of the translocation clone from Case 2 
Weaisodetennineath^aSSvTN^lfi^^ 
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0:5 Kb 

semi Intaet. Boxee denwo Ae iw «,onsi rMtrlctlon emynwa are (B) lltoM. WfSlw iSICff^S^S^SSfSStT*'^^'^ 
[illegible] locations of the two cloned breakpoints in relation to the 
IL-3 gene. The two chromosome 5 breakpoints were 
separated by less than 500 bp. 

The genomic structure in Cases 1 and 2 suggested that a 
normal IL-3 gene product was over-expressed as a result of 
the altered promoter structure. This would predict that the 
IL-3 gene on the translocated chromosome was capable of 
making IL-3 protein. This prediction was tested by expressing 
a genomic fragment from the translocated allele of Case 
1 containing all five IL-3 exons under the control of the SV40 
promotor/enhancer in the Cos7 cell line. Cell supernatants 
were studied in a proliferation assay using the factor dependent 
erythroleukemic cell line, TF-1. The supernatants 
derived from transfections using the vector plus insert 
supported TF-1 proliferation, while supernatants from 
transfections using the vector alone were negative in this assay 
(data not shown). Furthermore, the biological activity could be 
blocked by an antibody to human IL-3 (BVD3-6G8). This 
result showed that the translocated allele retained the ability 
to make IL-3 mRNA and protein. 

The level of expression of IL-3 mRNA in leukemic cells 
from Case 1 was assessed. Northern blotting showed that the 
mature IL-3 mRNA (approximately 1 kb) and a 2.9 kb 
unspliced IL-3 mRNA were excessively produced by the 
leukemia (Fig 3). The 2.9 kb form of the mRNA is also 
present at low levels in normal peripheral blood T lymphocytes 
after mitogen activation (Fig 3). Several B-lineage 
acute leukemia samples without the t(5;14) translocation 
had undetectable levels of IL-3 mRNA in these experiments. 
In addition, although genes for GM-CSF and IL-5 map close 
to the IL-3 gene and might have been deregulated by the 
translocation, no IL-5 or GM-CSF mRNA could be detected 
in the leukemic sample (data not shown). 

Three serum samples from Case 2 were assayed by 
immunoassay for levels of IL-3, GM-CSF, and IL-5 (Table 
1). Serum IL-3 could be detected and correlated with the 
clinical course. When the patient's leukemic cell burden was 
highest, the IL-3 level was highest. No serum GM-CSF or 
IL-5 could be detected. 

Since the IL-3 immunoassay measured only immunoreactive 
factor, we confirmed that biologically active IL-3 was 
present by using the TF-1 bioassay. This bioassay can be 
rendered monospecific using appropriate neutralizing 
monoclonal antibodies specific for IL-3, IL-5, or GM-CSF. We 
observed that sera from 1-16-84 and 3-14-84 contained TF-1 
stimulating activity that could be blocked with anti-IL-3 
MoAb (BVD3-6G8), but not with MoAbs to IL-5 (JES1- 
39D10) or GM-CSF (BVD2-23B6) (Fig 4; GM-CSF data 
not shown). The amount of neutralizable bioactivity in these 
two samples correlated very well with the difference in IL-3 
levels obtained by immunoassay for these samples. 
Furthermore, the failure to block TF-1 proliferating activity with 
either anti-IL-5 or anti-GM-CSF is consistent with the 
inability to measure these factors [illegible]. 

Table 1, Pariphml Blooil Count* and Growth Factor UveU 
at Different Tlmo« in Case 2 



SamptoOata 



Peifphenri blood counts (celts/iiL) 
WBC 

Lymphob!asts 
Eosinophils 
Serum growth factor tevtfto (pg/mL) 

gnm:sf 
tt.-e 



11/18/83 t/ie/e4 3/14/B4 



81.800 
0 

46,026 

' <444 
<1B 
<60 



116,500 
33,785 
73,080 

7,896 
<16 
<60 



12.300 
0 
616 

1.051 
<1B 
<60 



Peripheral blood counts [illegible]. The 
patient received chemotherapy between 1/16/84 and 3/14/84 to lower 
[illegible]. [illegible] were available for a similar 
analysis of Case 1. 

Abbreviation: WBC, white blood cells. 
DISCUSSION 

In this report, we have extended our analysis of acute 
lymphocytic leukemia and eosinophilia associated with the 
t(5;14) translocation. In both cases we have studied, we have 
documented the joining of the IL-3 gene from chromosome 5 
to the IgH gene from chromosome 14. The breakpoints on 
chromosome 5 are within 500 bp of each other, suggesting 
that additional breakpoints will be clustered in a small region 
of the IL-3 promotor. The PCR assay we have developed will 
be useful in the screening of additional clinical samples for 
this abnormality. 

The finding of a disrupted IL-3 promotor associated with 
an otherwise normal IL-3 gene implied that this translocation 
might lead to the overexpression of a normal IL-3 gene 
product. In this work, we have documented that this is true. 
In addition, neither GM-CSF nor IL-5 are over-expressed by 
the leukemic cells. Furthermore, in one patient, serum IL-3 
could be measured and correlated with disease activity. To 
our knowledge, this is the first measurement of human IL-3 
in serum and its association with a disease process. The 
[illegible] 
may now be indicated. 
Clinical and Pathologic Significance of the 
c-erbB-2 (HER-2/neu) Oncogene 

Timothy P. Singleton and John G. Strickler 



The c-erbB-2 oncogene was first shown to have clinical significance in 1987 by 
Slamon et al, who reported that c-erbB-2 DNA amplification in breast 
carcinomas correlated with decreased survival in patients with metastasis to axillary 
lymph nodes. Subsequent studies, however, of c-erbB-2 activation in breast 
carcinoma reached conflicting conclusions about its clinical significance. This 
oncogene also has been reported to have clinical and pathologic implications in 
other neoplasms. Our review summarizes these various studies and examines 
the clinical relevance of c-erbB-2 activation, which has not been emphasized in 
recent reviews. The molecular biology of the c-erbB-2 oncogene has been 
extensively reviewed and will be discussed only briefly here. 

BACKGROUND 

The c-erbB-2 oncogene was discovered in the 1980s by three lines of 
investigation. The neu oncogene was detected as a mutated transforming gene in 
neuroblastomas induced by ethylnitrosourea treatment of fetal rats. The c- 
erbB-2 was a human gene discovered by its homology to the retroviral gene v- 
erbB. HER-2 was isolated by screening a human genomic DNA library for 

homology with v-erbB. When the DNA sequences were determined 
subsequently, c-erbB-2, HER-2, and neu were found to represent the same gene. 
Recently, the c-erbB-2 oncogene also has been referred to as NGL. 

The c-erbB-2 DNA is located on human chromosome 17q21 and codes 
for c-erbB-2 mRNA (4.6 kb), which translates c-erbB-2 protein (p185). This 
protein is a normal component of cytoplasmic membranes. The c-erbB-2 
oncogene is homologous with, but not identical to, c-erbB-1, which is located 
on chromosome 7 and codes for epidermal growth factor receptor. The c- 
erbB-2 protein is a receptor on cell membranes and has intracellular tyrosine 
kinase activity and an extracellular binding domain. Electron microscopy 
with a polyclonal antibody detects c-erbB-2 immunoreactivity on cytoplasmic 
membranes of neoplasms, especially on microvilli and the non-villous outer cell 
membrane. In normal cells, immunohistochemical reactivity for c-erbB-2 is 
frequently present at the basolateral membrane or the cytoplasmic membrane's 
brush border. 

There is experimental evidence that c-erbB-2 protein may be involved in 
the pathogenesis of breast neoplasia. Overproduction of otherwise normal 
c-erbB-2 protein can transform a cell line into a malignant phenotype. Also, 
when the neu oncogene containing an activating point mutation is placed in 
transgenic mice with a strong promoter for increased expression, the mice 
develop multiple independent mammary adenocarcinomas. In other 
experiments, monoclonal antibodies against the neu protein inhibit the growth (in 
nude mice) of a neu-transformed cell line, and immunization of mice with 
neu protein protects them from subsequent tumor challenge with the neu- 
transformed cell line. Some authors have speculated that the use of 
antagonists for the unknown ligand could be useful in future chemotherapy. Further 
review of this experimental evidence is beyond the scope of this article. 

The c-erbB-2 activation most likely occurs at an early stage of neoplastic 
development. This hypothesis is supported by the presence of c-erbB-2 
activation in both in situ and invasive breast carcinomas. In addition, studies of 
metastatic breast carcinomas usually demonstrate uniform c-erbB-2 activation 
at multiple sites in the same patient, although c-erbB-2 activation has 
rarely been detected in metastatic lesions but not in the primary tumor. 
Even more rarely, c-erbB-2 DNA amplification has been detected in a primary 
breast carcinoma but not in its lymph node metastasis. In patients who have 
bilateral breast neoplasms, both lesions have similar [illegible] 
activation, but only a few such cases have been studied. 

MECHANISMS OF c-erbB-2 ACTIVATION 

The most common mechanism of c-erbB-2 activation is genomic DNA 
amplification, which almost always results in overproduction of c-erbB-2 mRNA and 
protein. The c-erbB-2 amplification may stabilize the overproduction of 
mRNA or protein through unknown mechanisms. Human breast carcinomas 
with c-erbB-2 amplification contain 2 to 40 times more c-erbB-2 DNA and 4 to 
128 times more c-erbB-2 mRNA than found in normal tissue. Most human 
breast carcinomas with c-erbB-2 amplification have 2 to 15 times more c-erbB-2 
DNA. Tumors with greater amplification tend to have greater 
overproduction. The non-mammary neoplasms that have been studied tend to have 
similar levels of c-erbB-2 amplification or overproduction relative to the 
corresponding normal tissue. 

The second most common mechanism of c-erbB-2 activation is 
overproduction of c-erbB-2 mRNA and protein without amplification of c-erbB-2 DNA. 
The quantities of mRNA and protein usually are less than those in amplified 
cases [illegible]. 
The c-erbB-2 protein overproduction without [illegible] 
or DNA amplification has been described in a few human breast carcinoma 
cell lines. 

Other rare mechanisms of c-erbB-2 activation have been reported. 
Translocations involving the c-erbB-2 gene have been described in a few mammary and 
gastric carcinomas, although some reported cases may represent restriction 
fragment length polymorphisms or incomplete restriction enzyme digestions 
that mimic translocations. A single point mutation in the 
transmembrane portion of neu has been described in rat neuroblastomas induced by 
ethylnitrosourea. The mutated neu protein has increased tyrosine kinase 
activity and aggregates at the cell membrane. Although there has been 
speculation that some of the amplified c-erbB-2 genes may contain point mutations, 
none has been detected in primary human neoplasms. 

TECHNIQUES FOR DETECTING c-erbB-2 ACTIVATION 
Detection of DNA Amplification 

Amplification of c-erbB-2 DNA is usually detected by DNA dot blot or 
Southern blot hybridization. In the dot blot method, the extracted DNA is placed 
directly on a nylon membrane and hybridized with a c-erbB-2 DNA probe. In 
the Southern blot method, the extracted DNA is treated with a restriction 
enzyme, and the fragments are separated by electrophoresis, transferred to a 
nylon membrane, and hybridized with a c-erbB-2 DNA probe. In both 
techniques, c-erbB-2 amplification is quantified by comparing the intensity 
(measured by densitometry) of the hybridization bands from the sample with those 
from control tissue. 

Several technical problems may complicate the measurement of c-erbB-2 
DNA amplification. First, the extracted tumor DNA may be excessively 
degraded or diluted by DNA from stromal cells. Second, the c-erbB-2 DNA 
probe must be carefully chosen and labeled. For example, oligonucleotide c- 
erbB-2 probes may not be sensitive enough for measuring a low level of c-erbB- 
2 amplification, because diploid copy numbers can be difficult to detect 
(unpublished data). Third, the total amounts of DNA in the sample and control tissue 
must be compensated for, often with a probe to an unamplified gene. Many 
studies have used control probes to genes on chromosome 17, the location of c- 
erbB-2, to correct for possible alterations in chromosome number. Identical 
results, however, are obtained by using control probes to genes on other 
chromosomes, with rare exception. Studies using control probes to the beta- 
globin gene must be interpreted with caution, because one allele of this gene is 
deleted occasionally in breast carcinomas. 
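
The normalization described above, in which the c-erbB-2 band intensity is corrected with a probe to an unamplified control gene and then compared with control tissue, can be sketched as a ratio of ratios. This is a minimal illustration and not a method taken from any of the cited studies. The function name and the densitometry readings are hypothetical.

```python
def fold_amplification(tumor_target, tumor_control,
                       normal_target, normal_control):
    """Relative c-erbB-2 copy number from densitometry readings.

    Each target reading is divided by the reading for an unamplified
    control gene on the same blot, which compensates for differences
    in the total amount of DNA loaded. The tumor ratio is then
    expressed relative to the ratio in control (normal) tissue.
    """
    tumor_ratio = tumor_target / tumor_control
    normal_ratio = normal_target / normal_control
    return tumor_ratio / normal_ratio


# Hypothetical band intensities in arbitrary densitometry units.
fold = fold_amplification(
    tumor_target=1200.0, tumor_control=150.0,
    normal_target=200.0, normal_control=100.0,
)
print(fold)  # 4.0
```

A control gene that is itself deleted or amplified in the tumor would distort the tumor ratio, which is why the choice of control probe matters in the studies discussed above.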

Amplification of c-erbB-2 DNA was assessed by using the polymerase 
chain reaction (PCR) in one recent study. Oligoprimers for the c-erbB-2 gene 
and a control gene are added to the sample's DNA, and PCR is performed. If 
the sample contains more copies of c-erbB-2 DNA than of the control gene, the 
c-erbB-2 DNA is replicated preferentially. 

Detection of c-erbB-2 mRNA Overproduction 

Overproduction of c-erbB-2 mRNA usually is measured by RNA dot blot or 
Northern blot hybridization. Both techniques require extraction of RNA but 
otherwise are analogous to DNA dot blot and Southern blot hybridization. Use 
of PCR for detection of c-erbB-2 mRNA has also [illegible] 
abstracts. 
Overproduction of c-erbB-2 mRNA can be measured by in situ 
hybridization. Sections are mounted on glass slides, treated with protease, hybridized 
with a radiolabeled probe, washed, treated with nuclease to remove unbound 
probe, and developed for autoradiography. Silver grains are seen only over 
tumor cells that overproduce c-erbB-2 mRNA. Negative control probes are 
used. Our experience indicates that these techniques are relatively 
insensitive for detecting c-erbB-2 mRNA overproduction in routinely processed 
tissue. Although the sensitivity may be increased by modifications that allow 
simultaneous detection of c-erbB-2 DNA and mRNA, in situ hybridization still 
is cumbersome and expensive (unpublished data). 

All of the above c-erbB-2 mRNA detection techniques have several 
problems that make them more difficult to perform than techniques for detecting 
DNA amplification. One major problem is the rapid degradation of RNA in 
tissue that is not immediately frozen or fixed. In addition, during the detection 
procedure, RNA can be degraded by RNase, a ubiquitous enzyme, which must 
be eliminated meticulously from laboratory solutions. Third, [illegible] 
genes that are uniformly expressed in the tissue of interest need to be carefully 
selected. 
Detection of c-erbB-2 Protein Overproduction 

The most accurate methods for detecting c-erbB-2 protein overproduction are 
the Western blot method and immunoprecipitation. Both techniques can 
document the binding specificity of various antibodies against c-erbB-2 protein. In 
Western blot studies, protein is extracted from the tissue, separated by 
electrophoresis (according to size), transferred to a membrane, and detected by using 
antibodies to c-erbB-2. In immunoprecipitation studies, antibodies against c-erbB- 
2 are added to a tumor lysate, and the resulting protein-antibody precipitate is 
separated by gel electrophoresis and stained for protein. Both Western blot and 
immunoprecipitation are useful research tools but currently are not practical for 
diagnostic pathology. Two recent abstracts have described an enzyme-linked 
immunosorbent assay (ELISA) for detection of c-erbB-2 protein. 
Overproduction of c-erbB-2 protein is most commonly assessed by various 
immunohistochemical techniques. These procedures often generate conflicting 
results, which are explained at least partially by three factors. First, various 
studies have used different polyclonal and monoclonal antibodies. Because 
some polyclonal antibodies recognize weak bands in addition to the c-erbB-2 
protein band on Western blot or immunoprecipitation, [illegible] 
studies should be interpreted with caution. Even some monoclonal 
antibodies immunoprecipitate protein bands in addition to c-erbB-2 (p185). 
Second, tissue fixation contributes to variability between studies. For example, 
some antibodies detect c-erbB-2 protein only in frozen tissue and do not react 
in fixed tissue. In general, formalin fixation diminishes the sensitivity of 
immunohistochemical methods and decreases the number of reactive cells. 
When Bouin's fixative is used, there may be a [illegible] percentage of positive 
cases. Third, minimal criteria for interpreting [illegible] 
are generally lacking. Although there is general agreement that distinct crisp 
cytoplasmic membrane staining is diagnostic for c-erbB-2 activation in breast 
carcinoma, the number of positive cells and the staining intensity required to 
diagnose c-erbB-2 protein overproduction varies from study to study and from 
antibody to antibody. Degradation of c-erbB-2 protein is not a problem because 
it can be detected in intact form more than 24 hours after tumor resection 
without fixation or freezing. 

ACTIVATION OF c-erbB-2 IN BREAST LESIONS
Incidence of c-erbB-2 Activation

Most studies of c-erbB-2 oncogene activation do not specify histological
subtypes of infiltrating breast carcinoma. Amplification of c-erbB-2 DNA was found
in 19.1 percent (519 of 2715) of invasive carcinomas in 25 studies (Table 1), and
c-erbB-2 mRNA or protein overproduction was detected in 20.9 percent (566 of
2714) of invasive carcinomas in 20 studies. Twelve studies have documented
c-erbB-2 mRNA or protein overproduction in 16 percent (88 of 604) of carcinomas
that lacked c-erbB-2 DNA amplification.

The incidence of c-erbB-2 activation in infiltrating breast carcinoma varies
with the histological subtype. Approximately 22 percent (142 of 650) of
infiltrating ductal carcinomas have c-erbB-2 activation, as expected from the above
data. Other variants of breast carcinoma with frequent c-erbB-2 activation are
inflammatory carcinoma (62 percent, 54 of 87), Paget's disease (82 percent, 9 of
11), and medullary carcinoma (22 percent, 5 of 23). In contrast, c-erbB-2
activation is infrequent in infiltrating lobular carcinoma (7 percent, 5 of 73) and
tubular carcinoma (7 percent, 1 of 15).

The c-erbB-2 protein overproduction is present in 44 percent (44 of 100) of
ductal carcinomas in situ and especially comedocarcinoma in situ (68 percent,
49 of 72). The micropapillary type of ductal carcinoma in situ also tends to have
c-erbB-2 activation, especially if larger cells are present. The greater
frequency of c-erbB-2 protein overproduction in comedocarcinoma in situ,
compared with infiltrating ductal carcinoma, could be explained by the fact that
many infiltrating ductal carcinomas arise from other types of intraductal
carcinoma, which show c-erbB-2 activation infrequently. Others have speculated
that carcinoma in situ with c-erbB-2 activation tends to regress or to lose
c-erbB-2 activation during progression to invasion. Infiltrating and in situ
components of ductal carcinoma, however, usually are similar with respect to
c-erbB-2 activation, although some authors have noted more heterogeneity of
immunohistochemical staining pattern in invasive than in in situ carcinoma.
Activation of c-erbB-2 is infrequent in lobular carcinoma in situ. If
lesions contain more than one histological pattern of carcinoma in situ, the
c-erbB-2 protein overproduction tends to occur in the comedocarcinoma in situ
but may include other areas of carcinoma in situ. Overproduction of
c-erbB-2 protein in ductal carcinoma in situ correlates with large cell size and a
periductal lymphoid infiltrate.

Activation of c-erbB-2 has not been identified in benign breast lesions,
including fibrocystic disease, fibroadenomas, and radial scars (Table 2). Strong
membrane immunohistochemical reactivity for c-erbB-2 has not been described
in atypical ductal hyperplasia, although weak accentuation of membrane staining
has been noted infrequently. In normal breast tissue, c-erbB-2 DNA is
diploid, and c-erbB-2 is expressed at lower levels than in activated tumors.

These preliminary data suggest that c-erbB-2 activation may not be useful
for resolving many of the common problems in diagnostic surgical pathology. For
example, c-erbB-2 activation is infrequent in tubular carcinoma and radial scars.
In addition, because c-erbB-2 activation is unusual in atypical ductal hyperplasia,
cribriform carcinoma in situ, and papillary carcinoma in situ, detection of
c-erbB-2 activation in these lesions may not be helpful in their differential
diagnosis. The histological features of comedocarcinoma in situ, which commonly
overproduces c-erbB-2, are unlikely to be mistaken for those of benign lesions.
Activation of c-erbB-2, however, does favor infiltrating ductal carcinoma over
infiltrating lobular carcinoma. Further studies of these issues would be useful.

Correlation of c-erbB-2 Activation With Pathologic Prognostic Factors

Multiple studies have attempted to correlate c-erbB-2 activation with various
pathologic prognostic factors (Table 3). Activation of c-erbB-2 was correlated
with lymph node metastasis in [illegible] of 28 series, with higher histological
grade in 6 of 17 series, and with higher stage in 4 of 14 series. Large size was not
associated with c-erbB-2 activation in most studies (11 of [illegible]). Tetraploid
DNA content and low proliferation, measured by Ki-67, have been suggested as
prognostic factors and may correlate with c-erbB-2 activation.

Correlation of c-erbB-2 Activation With Clinical Prognostic Factors
Various studies have attempted also to correlate c-erbB-2 activation with clinical
features that may predict a poor outcome (Table 4). Activation of c-erbB-2
correlated with absence of estrogen receptors in 10 of 28 series and with
absence of progesterone receptors in 6 of 15 series. In most studies, patient age
did not correlate with c-erbB-2 activation, and, in the rest of the reports,
c-erbB-2 activation was associated with either younger or older ages.

Correlation of c-erbB-2 Activation With Patient Outcome

Slamon et al first showed that amplification of the c-erbB-2 oncogene
independently predicts decreased survival of patients with breast carcinoma. The
correlation of c-erbB-2 amplification with poor outcome was nearly as strong as
the correlation of number of involved lymph nodes with poor outcome. Slamon
et al also reported that c-erbB-2 amplification is an important prognostic
indicator only in patients with lymph node metastasis.

A large number of subsequent studies also attempted to correlate c-erbB-2
activation with prognosis (Table 5). In 12 series, there was a correlation
between c-erbB-2 activation and tumor recurrence or decreased survival. In five
of these series, the predictive value of c-erbB-2 activation was reported to be
independent of other prognostic factors. In contrast, 18 series did not confirm
the correlation of c-erbB-2 activation with recurrence or survival. Four possible
explanations for this controversy are discussed below.

One problem is that c-erbB-2 amplification correlates with prognosis
mainly in patients with lymph node metastasis. As summarized in Table 5, most
studies of patients with axillary lymph node metastasis showed a correlation of
c-erbB-2 activation with poor outcome. In contrast, most studies of patients
without axillary metastasis have not demonstrated a correlation with patient
outcome. Table 6 summarizes the studies in which all patients (with and
without axillary metastasis) were considered as one group. There is a trend for
studies with a higher percentage of metastatic cases to show an association
between c-erbB-2 activation and poor outcome. Thus, most of the current
evidence suggests that c-erbB-2 activation has prognostic value only in patients
with metastasis to lymph nodes.
[illegible] of carcinoma are grouped [illegible]
carcinoma, but it is an uncommon lesion.

A third potential problem is the paucity of studies attempting to [illegible]
patients without lymph node metastasis who had various risk factors for
recurrence (such as large size or [illegible]) [illegible] overexpression [illegible]
recurrence. In patients with ductal carcinoma in situ, one small study found
no association between tumor recurrence and c-erbB-2 activation.

A fourth problem is the lack of data regarding whether the prognosis
correlates better with c-erbB-2 DNA amplification or with [illegible]
overproduction. Most studies that find a correlation between c-erbB-2
activation and poor patient outcome measure c-erbB-2 DNA amplification [illegible]
overproduction, and many studies [illegible]



[illegible] describe EGFR in breast carcinoma. [illegible]
genes c-erbA and ear-1 [illegible]
tumors have a decreased survival [illegible] amplification. [illegible]

Other genes also have been compared with c-erbB-2 [illegible]
ACTIVATION OF c-erbB-2 IN NON-MAMMARY TISSUES

Incidence of Activation in Non-Mammary Tissues

Table 7 summarizes the normal tissues in which c-erbB-2 expression has been
detected, usually with immunohistochemical methods using polyclonal
antibodies. Only a few studies have been performed, and some of these do not
High-throughput technologies, such as proteomic screening and DNA micro-arrays, produce vast
amounts of data requiring comprehensive analytical methods to decipher the biologically relevant
results. One approach would be to manually search the biomedical literature; however, this would be
an arduous task. We developed an automated literature-mining tool, termed MedGene, which
comprehensively summarizes and estimates the relative strengths of all human gene-disease
relationships in Medline. Using MedGene, we analyzed a novel micro-array expression dataset
comparing breast cancer and normal breast tissue in the context of existing knowledge. We found no
correlation between the strength of the literature association and the magnitude of the difference in
expression level when considering changes as high as 5-fold; however, a significant correlation was
observed (r = 0.41; p = 0.05) among genes showing an expression difference of 10-fold or more.
Interestingly, this only held true for estrogen receptor (ER) positive tumors, not ER negative. MedGene
identified a set of relatively understudied, yet highly expressed genes in ER negative tumors worthy of
further examination.

Keywords: bioinformatics • micro-array • text mining • gene-disease association • breast cancer



Introduction 

At its current pace, the accumulation of biomedical literature
outpaces the ability of most researchers and clinicians to stay
abreast of their own immediate fields, let alone cover a broader
range of topics. For example, to follow a single disease, e.g.,
breast cancer, a researcher would have had to scan 130 different
journals and read 27 papers per day in 1999. This problem is
accentuated with high-throughput technologies such as DNA
micro-arrays and proteomics, which require the analysis of
large datasets involving thousands of genes, many of which are
unfamiliar to a particular researcher. In any microarray
experiment, thousands of genes may demonstrate statistically
significant expression changes, but only a fraction of these may
be relevant to the study. The ability to interpret these datasets
would be enhanced if they could be compared to a comprehensive
summary of what is known about all genes. Thus, there
is a need to summarize existing knowledge in a format that
allows for the rapid analysis of associations between genes and
diseases or other specific biological concepts.

One solution to this problem is to compile structured digital
resources, such as the Breast Cancer Gene Database and the
Tumor Gene Database. However, as these resources are
hand-curated, the labor-intensive review process becomes a
rate-limiting step in the growth of the database. As a result, these
databases have a limited scale and the genes are not selected
in a systematic fashion.

An alternative approach is automated text mining, a method
which involves automated information extraction by searching
documents for text strings and analyzing their frequency and
context. This approach has been used successfully in several
instances for biological applications. In most cases, it has been
applied to extract information about the relationships or
interactions that proteins or genes have with one another, in
the literature or by functional annotation. Thus far, few
publications have applied text-mining to examine the global
relationships between genes and diseases. Perez-Iratxeta et al.
automatically examined the GO (Gene Ontology) annotation
of genes and their predicted chromosomal locations in order
to identify genes linked to inherited disorders.

To obtain a more global understanding of disease development,
it would be valuable to incorporate information regarding
all possible gene-disease relationships, including biochemical,
physiological, pharmacological, epidemiological, as well as
genetic. This information would enable comprehensive
comparisons between large experimental datasets and existing
knowledge in the literature. This would accomplish two things.
First, it would serve to validate experiments by demonstrating
that known responses occur as predicted. Second, it would
rapidly highlight which genes are corroborated by the literature
and which genes are novel in a given context. We have utilized
a computational approach to literature mining to produce a
comprehensive set of gene-disease relationships. In addition,
we have developed a novel approach to assess the strength of
each association based on the frequency of citation and
co-citation. We applied this tool to help interpret the data from a
large micro-array gene expression experiment comparing
normal and cancerous breast tissue.

Methods 

MedGene Database. MedGene is a relational database, storing
disease and gene information from NCBI, text mining results,
statistical scores, and hyperlinks to the primary literature.
MedGene has a web-based user interface for users to
query the database (http://hipseq.med.harvard.edu/MedGene/).

Text Mining Algorithms. MeSH files were downloaded from
the MeSH web site at NLM (National Library of Medicine)
(http://www.nlm.nih.gov/mesh/meshhome.html) and human disease
categories were selected. LocusLink files were downloaded from
the LocusLink web site at NCBI (http://www.ncbi.nih.gov/LocusLink/).
Official/preferred gene symbol, official/preferred
gene name, and gene alternative symbols and names, all
relevant annotations and URLs for each LocusLink record, were
collected. Gene search terms were used for literature searching
and included all qualified gene names, gene symbols, and gene
family terms. Primary gene keys, predominantly qualified gene
family terms and gene official/preferred symbols, were used
to index Medline records. If the official/preferred gene symbols
did not meet the standards to be an index, then qualified gene
official/preferred names were used. A local copy of Medline
records (up to July, 2002) was pre-selected.

A JAVA module examined the MeSH terms and then indexed
each Medline record with the appropriate disease terms. A
separate JAVA module was used to examine the titles and
abstracts for gene search terms and then to index the
gene-related Medline records with the relevant primary gene key(s).

Statistical Methods. For every gene and disease pair, we
counted records that were indexed for both gene and disease
(double positive hits), for disease only (disease single hits), for
gene only (gene single hits), and for neither gene nor disease
(double negative hits) to generate a 2 x 2 contingency table.
On the basis of the contingency table framework, we applied
different statistical methods to estimate the strength of
gene-disease relationships and evaluated the results. These methods
included chi-square analysis, Fisher's exact probabilities,
relative risk of gene, and relative risk of disease
(http://hipseq.med.harvard.edu/MedGene/). In addition, we computed
the 'product of frequency', which is the product of the
proportion of disease/gene double hits to disease single hits
and the proportion of disease/gene double hits to gene single
hits. To obtain a normal distribution, we transformed all the
statistical scores using the natural logarithm. We selected the
log of the product of frequency (LPF) to validate MedGene and
to use for the analysis with the micro-array data. Spearman
rank-correlation coefficients were used to assess the linear
relationship between LPF and micro-array fold change in
expression level.
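
The product-of-frequency score described above can be illustrated with a minimal Python sketch. It is not the authors' JAVA implementation: the function names and the example counts are hypothetical, and the sketch assumes the "single hit" counts are used as denominators exactly as the text words them, since the text does not say whether those counts also include the double hits.

```python
import math


def product_of_frequency(double_hits, disease_single_hits, gene_single_hits):
    """Product of the two proportions named in the text.

    Assumption: the single-hit counts are taken literally as denominators;
    the source does not state whether they also include the double hits.
    """
    if disease_single_hits <= 0 or gene_single_hits <= 0:
        raise ValueError("single-hit counts must be positive")
    return (double_hits / disease_single_hits) * (double_hits / gene_single_hits)


def log_product_of_frequency(double_hits, disease_single_hits, gene_single_hits):
    """Natural log of the product of frequency (LPF).

    Returns None when there are no co-citations, because log(0) is undefined.
    """
    if double_hits <= 0:
        return None
    return math.log(
        product_of_frequency(double_hits, disease_single_hits, gene_single_hits)
    )


# Hypothetical counts for one gene-disease pair (not taken from the paper).
lpf = log_product_of_frequency(
    double_hits=50, disease_single_hits=1000, gene_single_hits=200
)
print(round(lpf, 3))
```

Because the score is a logarithm of a product of proportions, it is negative whenever both proportions are below 1, and pairs with no co-citation receive no score at all in this sketch.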

Global Analysis. Diseases with at least 50 related genes were
selected for clustering analysis, and the LPF scores were
normalized with total score for each disease. Hierarchical
clustering was done with the "Cluster" software and the
clustering result was visualized using "TreeViewer"
(http://rana.lbl.gov/EisenSoftware.htm).

Breast Tissue Micro-Arrays. Eighty-nine breast cancer
samples (79% ER-positive) and 7 normal breast tissue samples
were selected from the Harvard Breast SPORE frozen tissue
repository and were representative of the spectrum of
histological types, grades, and hormone receptor immuno-phenotypes
of breast cancer. Biotinylated cRNA, generated from the
total RNA extracted from the bulk tumor, was hybridized to
Affymetrix U95A oligo-nucleotide micro-arrays. These
micro-arrays consist of 12 400 probes, which represent approximately
9000 genes. Raw expression values were obtained using GENECHIP
software from Affymetrix, and then further analyzed using
the DNA-Chip Analyzer (dChip) custom software.


Results 

Automated Indexing of Medline Records by Disease and
Gene. To study the gene-disease associations in the literature,
we first compiled complete lists for human diseases and human
genes. To index all Medline records that were relevant to
human diseases, the Medical Subject Heading (MeSH) index
of Medline records was utilized. MeSH is a controlled medical
vocabulary from the National Library of Medicine and consists
of a set of terms or subject headings that are arranged in both
an alphabetic and a hierarchical structure. Medline records
are reviewed manually and MeSH terms are added to each with
software assistance. Twenty-three human disease category
headings along with all of their child terms (see the Supporting
Information, Supplemental Table 1, or visit
http://hipseq.med.harvard.edu/MedGene/publication/s_table1.html) were
selected from the 2002 MeSH index creating a list of 4033
human diseases.

No index comparable to the MeSH index exists for genes,
and thus, it was necessary to apply a string search algorithm
for gene names or symbols found in Medline text. A complete
list of genes, gene names, gene symbols, and frequently used
synonyms were collected from the LocusLink database at
NCBI, which contains 53 259 independent records keyed
by an official gene symbol or name (June 18th, 2002). For the
purposes of this study, no distinction was made between genes
and their gene products. Authors often use the same name for
both, differentiating the two only by the use of italics, if at all.
For the intended use of this study, this lack of distinction is
unlikely to have a large effect and may in fact be beneficial.

Initial attempts to search the literature using these lists
revealed several sources of false positives and false negatives
(Table 1). False positives primarily arose when the searched
term had other meanings, whereas false negatives arose from
syntax discrepancies, necessitating the development of filters
to reduce these errors. The syntax issues were readily handled
by including alternate syntax forms in the search terms. The
false positive cases, caused by duplicative and unrelated
meanings for the terms, were more difficult to manage. Where
possible, case sensitive string mapping reduced inappropriate
citations. In many cases, however, this was not sufficient and
the terms had to be eliminated entirely, thereby reducing the
false positive rate but unavoidably under-representing some
genes.
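
The case-sensitive filter can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' JAVA module: the set `CASE_SENSITIVE_TERMS` and the function `term_in_text` are hypothetical names, and the only symbol placed in the set is WAS, which the paper's Table 1 lists as colliding with the English word "was".

```python
import re

# Hypothetical list of symbols that collide with ordinary English words and
# are therefore matched case-sensitively (WAS is the example in Table 1).
CASE_SENSITIVE_TERMS = {"WAS"}


def term_in_text(term, text):
    """Return True if `term` occurs in `text` as a whole token.

    Terms in CASE_SENSITIVE_TERMS must match exactly as written; all other
    terms are matched case-insensitively.
    """
    flags = 0 if term in CASE_SENSITIVE_TERMS else re.IGNORECASE
    # Lookarounds keep the term from matching inside a longer token.
    pattern = r"(?<![A-Za-z0-9])" + re.escape(term) + r"(?![A-Za-z0-9])"
    return re.search(pattern, text, flags) is not None


print(term_in_text("WAS", "The patient was treated."))
print(term_in_text("WAS", "Mutations in WAS were found."))
```

A case-sensitive match still misfires on sentences written in capitals, which is consistent with the text's remark that some terms had to be eliminated entirely.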

For the purposes of data tracking, a primary gene key was
selected to represent all synonyms that correspond to each
gene. Medline records were indexed with a primary gene key
when any synonym for that key was found in the title or
abstract. Case-insensitive string mapping was used for all
searches except as noted above. No additional weight was
added for multiple occurrences of a term or the co-occurrence
of multiple synonyms for the same gene key.

Table 1

source of error                        error type      example                                              filter solution
gene symbol/name is not unique         false positive  MAG: myelin associated glycoprotein;                 eliminate this term
                                                       MAG: malignancy-associated protein
gene symbol is unrelated abbreviation  false positive  PA: pallid homologue (mouse), pallidin               eliminate this term
                                                       (also abbrev. for Pennsylvania)
gene symbol/name has language meaning  false positive  WAS: Wiskott-Aldrich Syndrome                        case-sensitive string search
                                                       (also the word "was")
nonstandard syntax                     false negative  BAG-1 instead of BAG1                                add dash term
unofficial gene name/symbol            false negative  P53 instead of TP53                                  add all gene nicknames
nonspecified gene name                 false negative  estrogen receptor instead of estrogen receptor 1     add family stem term

In preliminary studies, Medline was searched for co-occurrence of genes and diseases and the resulting output was evaluated to identify error sources that
were amenable to global filters. Each error source is categorized by the type of error it causes: false positives are suggested relationships that are not real and
false negatives are real relationships that are underrepresented. The filter solutions used are indicated. Note that in some cases, the filter solution itself introduces
error. In general, error rates maximized sensitivity, even at the expense of specificity if needed.

Medline records were searched with all qualified gene
identifiers, such as the official/preferred gene symbol, the
official/preferred gene name, all gene nicknames and all syntax
variants. In situations where there are several members of a
gene family or splice variants, some authors prefer to use a
shortened gene family name, e.g., estrogen receptor instead of
estrogen receptor 1 (ESR1), creating a source of false negatives.
For this reason, gene family stem terms were created for all
genes that have an alpha or numerical suffix (e.g., IL2RA, TCF[illegible],
ESR3, etc.) and then used to search the literature. The family
stem terms were handled separately from the specific gene
names so that it would be clear when linkages were made to
the gene family versus a specific member in that family.
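
One way to derive a family stem term from a full gene name is sketched below in Python. The paper does not give its exact rule, so this is an assumption: the hypothetical function `family_stem` strips a single trailing number, single letter, or spelled-out Greek letter, and handles full names only, not symbols.

```python
import re

# Assumed suffix rule (not taken from the paper): a trailing number, a single
# letter, or a spelled-out Greek letter, preceded by whitespace or a comma.
_SUFFIX = re.compile(
    r"[\s,]+(?:\d+|[A-Za-z]|alpha|beta|gamma|delta)$", re.IGNORECASE
)


def family_stem(gene_name):
    """Return the family stem of a gene name, or None if it has no suffix."""
    name = gene_name.strip()
    stem = _SUFFIX.sub("", name)
    return stem if stem != name else None


print(family_stem("estrogen receptor 1"))
print(family_stem("interleukin 2 receptor, alpha"))
print(family_stem("tumor protein p53"))
```

Keeping the stem separate from the specific name, as the text describes, lets a search report whether a record was linked to the family or to one named member.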

To improve performance and accuracy, some pre-selection
was applied to the records that were scanned. First, review
articles were eliminated to avoid redundant treatment of
citations. Second, non-English journals were removed because
the natural language filters were only relevant to English
publications. Finally, journals unlikely to contain primary data
about gene-disease relationships were also removed (e.g., Int.
J. Health Educ., Bedside Nurse, and J. Health Econ.). Together,
these filters reduced the 12 198 221 Medline publications (July
2002) by 37%.

Ranking the Relative Strengths of Gene-Disease Associations.
In total, there were 618 708 gene-disease co-citations,
in which 16% (8297) of all studied genes had been associated
to a disease and 96% (3875) of all diseases had been associated
to at least one gene. To rank the relative strengths of
gene-disease relationships, we tested several different statistical
methods and examined the results. With the exception of the
relative risk estimates, the methods provided similar results
with respect to the rank order of the gene-disease association
strengths. However, after comparing the results to other
databases and after consulting disease experts, the log of the
product of frequency (LPF) was selected for further analysis
because it gave the best results overall.

Validation of MedGene. In developing this tool, it was
important to minimize the number of missed genes (false
negatives) and miscalled genes (false positives). However, in
situations when these goals were in conflict, inclusiveness was
prioritized. To determine the false negative rate in MedGene,
breast cancer was used as a test case because it was associated
with more genes than any other human disease and because
there were several public databases that link genes to breast
cancer. We compared the list of breast cancer-related genes
from MedGene to these databases, illustrated in Figure 1.
Among the 285 distinct breast cancer-related genes that were
supported by at least one literature citation in these
hand-curated databases, 26 were absent from MedGene, suggesting
a false negative rate of approximately 9%. To determine why
these were missed, all literature references for these genes (80
papers) were reviewed manually (see the Supporting Information,
Supplemental Table 2, or visit
http://hipseq.med.harvard.edu/MedGene/publication/s_table2.html). Among
these papers, most false negatives were caused by nonstandard
gene terms or gene terms eliminated by our specificity filters.
Few genes were missed because they were only mentioned in
review papers (0.4%) or they appeared only in the body of the
manuscript but not the abstract or title (1.1%). Of note,
MedGene identified approximately 2000 additional breast
cancer-related genes not listed in any other database.

Figure 1. Estimation of the false negative rate by comparison
with hand-curated databases. The breast cancer-related genes
identified by MedGene were compared with those listed in
several other databases including the Tumor Gene Database
(TGD), the Breast Cancer Gene Database (BCG), GeneCards
(GC) and Swissprot. Genes were considered false negatives
if they were represented in at least one of these other databases
and not in MedGene and their link to breast cancer was
supported by at least one literature reference. All literature references
were verified by manual review to confirm their validity. The
number of genes in each database or shared by more than one
database is indicated. The false negative rate was calculated by
genes missed at MedGene (26)/total number of nonoverlapping
genes in other databases (285).
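
The false negative rate quoted above is a simple set comparison, sketched here in Python. The function name and the toy gene sets are hypothetical and serve only to show the calculation; the final line reproduces the reported figures of 26 missed genes out of 285.

```python
def false_negative_rate(curated_databases, tool_genes):
    """Fraction of curated genes (union over all databases) the tool missed."""
    curated = set().union(*curated_databases)
    if not curated:
        raise ValueError("no curated genes to compare against")
    missed = curated - set(tool_genes)
    return len(missed) / len(curated)


# Toy example with made-up database contents (not the paper's gene lists).
curated = [{"BRCA1", "BRCA2", "TP53"}, {"TP53", "ESR1"}]
tool = {"BRCA1", "TP53", "ESR1", "ERBB2"}
toy_rate = false_negative_rate(curated, tool)

# The figures reported in the text: 26 of 285 nonoverlapping genes missed.
reported_rate = 26 / 285
print(f"{reported_rate:.1%}")
```

Genes found only by the tool, such as ERBB2 in the toy example, do not enter this rate; they correspond to the additional genes the text notes were absent from the other databases.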

To assess the false positive error rate, two complementary
approaches were used: a detailed analysis of one disease and
a global examination of 1000 diseases. The detailed approach
examined the false positive error rate and its sources, whereas
the global approach tested whether the overall results made
biomedical sense.

Using the LPF, 1467 genes related to prostate cancer were
assembled in rank order. We then retrieved approximately 300
Medline records each for the highest ranked 100 and the lowest
ranked 200 genes and manually reviewed the titles and
abstracts to determine the verity of the association. Nearly 80%
of the highest ranked 100 genes fell into one of the five
categories that reflect meaningful gene-disease relationships
(see the Supporting Information, Supplemental Table 3, or visit
http://hipseq.med.harvard.edu/MedGene/publication/s_table3.html).
Among the lowest ranked 200 genes, approximately 70% reflected
true relationships. Of the 600 records
reviewed, there were only two in which the association between
the gene and the disease was described as negative. Both were
genes with very low scores. In both cases, the authors did not
argue the absence of any relationship, but rather that a
particular feature of the gene or protein was not shown to be
related to human prostate cancer.

The coincidence of some gene symbols with medical
abbreviations, chemical abbreviations and biological abbreviations
resulted in most of the false positives (see the Supporting
Information, Supplemental Table 4, or visit
http://hipseq.med.harvard.edu/MedGene/publication/s_table4.html),
emphasizing the importance of the filters that were added in the
search algorithm (Table 1). Without the filters, the false positive
rate more than doubled, and the false negative rate rose
dramatically (data not shown). For example, among the papers
about breast cancer, there were only 12 Medline records that
referred to ESR1 and 10 to ESR2, whereas almost 2000 papers
mentioned estrogen receptor without specifying ESR1 or ESR2;
this latter group was detected by the family stem term filter.

To further validate these results, a global analysis of the
gene-disease relationships described by MedGene was performed.
For this experiment, it was reasoned that the more closely
related the diseases are to one another, the more they will be
related to the same gene sets. Thus, if the relationships defined
by MedGene accurately reflected the literature, then an
unsupervised hierarchical clustering of the gene data should group
diseases in a manner consistent with common medical thinking.
Conversely, if the clustered diseases do not make sense
biologically or medically, it may reflect excessive false positives,
false negatives, or inappropriate scoring of the data.

To execute this experiment, the gene sets and the
corresponding LPF values for 1000 randomly selected diseases (each
with at least 50 gene relationships) were used as a dataset for
clustering the diseases. A review of the results showed that the
resulting disease clusters were indeed logical based upon
common medical knowledge (see the Supporting Information,
Supplemental Figure 1, or visit
http://hipseq.med.harvard.edu/MedGene/publication/s_figure1.html).
For example, in one such cluster shown in Figure 2, diabetes and
its complications grouped together and were also closely linked
to diseases associated with starvation states.

The number of genes associated with a given disease can
be estimated by adjusting the MedGene number up by the false
negative rate (~9%) and down by the false positive rate (~26%
on average). Using this, the average disease has 103.7 ± [illegible]
(mean ± s.d.) genes associated with it, although the range is
quite broad with 2359 genes related to breast cancer, 2122
genes related to lung cancer and no genes related to a number
of diseases.
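
The adjustment described above can be written out as a short Python sketch. The text does not give the exact formula, so the multiplicative form below is an assumption: "up by 9%" is read as a factor of 1.09 and "down by 26%" as a factor of 0.74. The function name and the example count are hypothetical.

```python
def adjusted_gene_count(raw_count, false_negative_rate=0.09, false_positive_rate=0.26):
    """Estimate true associations from a raw count of linked genes.

    Assumed reading of the text: scale up for missed genes and down for
    spurious ones. The source does not state the exact formula.
    """
    return raw_count * (1 + false_negative_rate) * (1 - false_positive_rate)


# Hypothetical raw count of 1000 linked genes for one disease.
print(round(adjusted_gene_count(1000), 1))
```

Under this reading the two corrections partly offset each other, with the larger false positive rate dominating, so the adjusted figure falls below the raw count.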

Applying MedGene to the Analysis of Large Datasets. Access
to a comprehensive summary of the genes linked to human
diseases provided an opportunity to analyze data obtained from
a high-throughput experiment. We compared the MedGene
breast cancer gene list to a gene expression data set generated
from a micro-array analysis comparing breast cancer and
normal breast tissue samples. Micro-array analysis identified
2286 genes that had greater than a 1-fold difference in mean
expression level between breast cancer samples and normal
breast samples. Using MedGene, we sorted the 2286 genes into
four classes: 555 genes directly linked to breast cancer in the
literature by gene term search (first-degree association by gene
name); 328 genes directly linked by family term search
(first-degree association by family term); 1021 genes linked to breast
cancer only through other breast cancer genes (second-degree
association); and 505 genes not previously associated with
breast cancer. (See the Supporting Information, Supplemental
Figure 2, or visit http://hipseq.med.harvard.edu/MedGene/
publication/s_Figure 2.html.) Among the 505 previously
unrelated genes, 487 were either newly identified genes or genes
that had not previously been associated with any disease.
Among the remaining 38 genes, 9 had been related to other
cancers, specifically esophageal, colon, uterine, skin, and cervix.
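The four-way sort described above can be sketched as a simple decision cascade. This is a hedged illustration, not the MedGene implementation: the function name, the input structures, and the rule that a gene-term hit takes precedence over a family-term hit are all assumptions, and the gene sets are invented.

```python
from collections import Counter

def classify_gene(gene, gene_term_hits, family_term_hits, gene_links):
    """Return the association class of one gene with respect to one disease.

    gene_term_hits   -- genes co-cited with the disease by gene name
    family_term_hits -- genes co-cited with the disease by family term
    gene_links       -- dict: gene -> set of genes it is co-cited with
    """
    if gene in gene_term_hits:
        return "first-degree (gene term)"
    if gene in family_term_hits:
        return "first-degree (family term)"
    directly_linked = gene_term_hits | family_term_hits
    # Second degree: linked to the disease only through a directly linked gene.
    if gene_links.get(gene, set()) & directly_linked:
        return "second-degree"
    return "not previously associated"

# Made-up inputs for illustration only.
gene_term_hits = {"BRCA1", "ERBB2"}
family_term_hits = {"CCND1"}
gene_links = {"GENE_X": {"BRCA1"}, "GENE_Y": {"GENE_Z"}}
array_genes = ["BRCA1", "CCND1", "GENE_X", "GENE_Y"]

counts = Counter(
    classify_gene(g, gene_term_hits, family_term_hits, gene_links)
    for g in array_genes
)
```

Tallying the classes with `Counter` gives the kind of per-class totals reported in the text.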

To determine whether the genes highlighted by the micro-array
analysis were more likely to have been previously linked
to breast cancer in the literature, we created a two-dimensional
plot of the fold change of expression level between breast
cancer and normal tissue versus the literature score (LPF)
(Figure 3A). There was a broad spread of expression changes
among the genes directly linked to breast cancer, ranging from
less than 1-fold change (68%) to over 40-fold (0.3%). Notably,
the majority of genes with greater than 10-fold expression
changes were linked to breast cancer by first-degree association.

Among all 754 genes directly linked to breast cancer in the
literature, there was no correlation between LPF and micro-array
fold change (r = 0.018, p-value = 0.62). However, when
we stratified the analysis based on the magnitude of the fold
change, we observed an increasing trend in correlation (Figure
3B), suggesting that genes with a more substantial change in
expression level were more likely to have a stronger association
in the literature. For genes that had 10-fold change or more in
expression level, the correlation increased to 0.41 (p-value =
0.05).
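A stratified rank correlation of this kind can be sketched as follows. The Figure 3B caption names the Spearman rank correlation; everything else here is an assumption made for illustration: the function names, the choice to include every gene at or above each fold-change threshold (rather than disjoint bins), and the requirement of at least three genes per stratum.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; None when either variable has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman rho is the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

def stratified_spearman(records, thresholds):
    """records: (absolute fold change, literature score) pairs.

    Returns {threshold: rho or None}, using genes with fold change >= threshold.
    """
    result = {}
    for t in thresholds:
        subset = [(f, s) for f, s in records if f >= t]
        if len(subset) < 3:
            result[t] = None
            continue
        folds, scores = zip(*subset)
        result[t] = spearman(list(folds), list(scores))
    return result
```

Calling `stratified_spearman(records, [3, 5, 10])` on a list of (fold change, LPF) pairs returns one coefficient per threshold, which can then be plotted against the threshold as in Figure 3B.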

When we evaluated the micro-array data separately for ER
positive and ER negative tumors, the trend in correlation
between fold change and literature score was highly dependent
on estrogen receptor status. Interestingly, there was a similar
trend in correlation for ER positive tumors, but no trend in
correlation for ER negative tumors.

Figure 2. Global validation by clustering analysis. 2(A). The gene sets and the corresponding LPF values for 1000 diseases, each with
at least 50 gene relationships, were used in an unsupervised clustering of the diseases based on the gene patterns associated with
them. A sample of the data is shown here. 2(B). One of the resulting clusters is shown that corresponds to blood sugar states. Diabetes
terms (above the line) and starvation states terms (under the line) clustered together. Within these groups, there is also clustering of
diabetic small vessel complications, altered serum chemistries, nutritional disorders, etc. (Supplemental Figure 1:
http://hipseq.med.harvard.edu/MedGene/publication/s_Figure 1.html).



Finally, to validate our findings, we computed similar
correlations between the breast cancer expression data and
LPF scores generated by MedGene for hypertension, a
disease unrelated to breast cancer. As expected, we did not
observe an increasing trend in correlation for hypertension.

Figure 3. Relationship between literature score and functional data for breast cancer. 3A. The data from an expression analysis of
samples for breast tumors and normal breast tissue were analyzed to indicate the fold difference of expression level between breast
tumor and normal sample (cutoff > 3-fold change). The fold changes were plotted against the literature score for the same gene set.
Green dots represent first-degree association by gene search, blue dots represent first-degree association by family search and red
dots represent no association. Some well-studied genes, such as BRCA2 (pink circle), are not reflected by a substantial difference in
expression level. Furthermore, the majority of genes that have no association with breast cancer in the literature had less than 10-fold
expression changes (shaded area). 3B. The Spearman rank-correlation coefficients between literature score (LPF) and the fold change
of expression level between tumor and normal breast samples (y-axis) in relation to the amount of fold change of expression level
(x-axis). Gene rank lists were generated for breast cancer (blue) and hypertension (pink). Correlations were also computed between
the breast cancer gene LPF scores and fold change expression data among estrogen receptor positive tumors only (light blue) and
estrogen receptor negative tumors only (purple).
Table 2. MedGene results for the top 25 genes associated with breast neoplasms, hypertension, rheumatoid arthritis, bipolar
disorder, and atherosclerosis, respectively, ranked by LPF scores. The hyperlink to all the papers co-citing the gene and the
disease is available at the MedGene website (http://hipseq.med.harvard.edu/MedGene/).

[Table body not recoverable from the scan; column and rank assignments are lost. Entries still legible, in scan order:
estrogen receptor, TP53, CEACAM5, HTR2C, vascular cell adhesion molecule, RELN, CD59, TNFRSF12, DBH, COX5A, ALB, IL2,
MAOA, cathepsin, COMT, INS, ERBB4, IL8, HTR2A, angiotensin receptor, interleukin 1, SYNJ1, matrix metalloproteinase,
CCND1, INPP1, OLR1, NPPA, interferon, NEDD4L, collagen, MUC1, CD68, MCP, insulin-like, IL4, ERBB2, lipoprotein, NPY,
IL17, BAIAP3, intercellular adhesion molecule, mucin, POMC, FGF3, neuropeptide, DRD5, RAB27A.]



Discussion 

The Human Genome Project heralded a new era in biological
research where the emphasis on understanding specific pathways
has expanded to global studies of genomic organization
and biological systems. High-throughput technologies can
provide novel insight into comprehensive biological function
but also introduce new challenges. The utility of these
technologies is limited by the ability to generate, analyze, and
interpret large gene lists. MedGene, a relational database
derived by mining the information in Medline, was created to
address this need. MedGene users can query for a rank-ordered
list of human gene-disease relationships (Table 2) for one or
more diseases. Each entry is hyperlinked to the original papers
supporting each association and to other relevant databases.

MedGene is an innovative extension of previous text mining
approaches. Perez-Iratxeta et al. used the GO annotation and
their chromosomal locations to predict genes that may
contribute to inherited disorders. MedGene takes a broader view
and includes all diseases and all possible gene-disease
relationships. Furthermore, MedGene utilizes co-citation to indicate a
relationship rather than GO annotation, which is limited to the
subset of genes that have GO annotation. Our approach is
complementary to that taken by Chaussabel and Sher, who
used the frequency of co-cited terms to cluster genes into a
hierarchy of gene-gene relationships.

A unique aspect of this tool is the ability to assess the relative
strengths of gene-disease relationships based on the frequency
of both co-citation and single citation. This presupposes that
most co-citations describe a positive association, often referred
to as publication bias, and is supported by our observations
that negative associations are rare (Supplemental Table 3:
http://hipseq.med.harvard.edu/MedGene/publication/s_Table 3.html).
Of course, relationships established by frequency
of co-citation do not necessarily represent a true biological link;
however, it is strong evidence to support a true relationship.

Another important feature of MedGene is the implementation
of software filters that substantially reduced the error rate.
We estimate that less than 10% of all associations were missed
and at least 70% of even the weakest associations were real.
For this study, all of the filters that we applied were general
ones, e.g., expanding the list of all gene names to address the
different syntax forms used by different journals, eliminating
gene names that correspond to common English words, etc.
The majority of the remaining search term ambiguities were
idiosyncratic and difficult to identify systematically without
causing a significant rise in false negatives. Alternative
approaches, such as the examination of the nearest neighbor
terms, need to be considered to further reduce the false positive
rate.
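The two general filters mentioned above (expanding gene names into alternative syntax forms and dropping names that are common English words) might look roughly like this. It is a sketch under stated assumptions: the variant rule (symbol, hyphenated, and spaced forms), the stop list, and both function names are hypothetical, since the text does not give the actual rules used.

```python
import re

# Hypothetical stop list; the list actually used by MedGene is not given in the text.
COMMON_ENGLISH_WORDS = {"was", "for", "and", "not", "cat", "rat"}

def name_variants(symbol):
    """Generate spelling variants of a gene symbol, e.g. IL2 / IL-2 / IL 2."""
    match = re.fullmatch(r"([A-Za-z]+)[- ]?(\d+[A-Za-z]*)", symbol)
    if not match:
        # Symbols that do not fit the letters-then-number pattern are kept as-is.
        return {symbol}
    stem, suffix = match.groups()
    return {stem + suffix, stem + "-" + suffix, stem + " " + suffix}

def build_search_terms(symbols):
    """Expand each symbol into variants, skipping ambiguous English words."""
    terms = {}
    for symbol in symbols:
        if symbol.lower() in COMMON_ENGLISH_WORDS:
            continue
        terms[symbol] = sorted(name_variants(symbol))
    return terms
```

For example, `build_search_terms(["WAS", "IL2"])` drops the ambiguous symbol and keeps three spellings of the other, which illustrates the trade-off the text describes: each filter removes false positives at the risk of adding false negatives.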

It is not uncommon to see expression changes in micro-array
experiments as small as 2-fold reported in the literature.
Even when these expression changes are statistically significant,
it is not always clear if they are biologically meaningful. When
comparing expression levels of disease to normal tissue, one
expects an enrichment of known disease-related genes to
appear in the altered expression group. MedGene provided a
unique opportunity to test this notion in the context of existing
knowledge on a novel breast cancer micro-array dataset. For
genes displaying a 5-fold change or less in tumors compared
to normal, there was no evidence of a correlation between
altered gene expression and a known role in the disease. This
reflects the many genes whose role in breast cancer may not
involve large changes in expression in sporadic tumors (e.g.,
BRCA1 and BRCA2) and genes whose modest changes in
expression may be unrelated to the disease. Strikingly, among
genes with a 10-fold change or more in expression level there
was a strong and significant correlation between expression
level and a published role in the disease, providing the first
global validation of the micro-array approach to identifying
disease-specific genes.

Table 3. Genes with Large Expression Changes in ER- but
Not in ER+ Breast Tumors

[The scan separates the gene-symbol column from the two fold columns, and one
symbol is missing, so rows cannot be realigned. Values are listed in scan order;
"?" marks an illegible digit or entry.]

fold        fold
610.8       12
89.4        12
69.?        1.9
59.6        1.0
38.3        ?
31.2        1.0
30.6        ?
27.9        3.6
21.9        4.7
18.6        1.0
14.6        1.6
14.4        -1.0
13.5        4.2
13.0        4.4
12.9        -1.2
12.?        2.9
12.2        1.0
11.8        4.0
11.6        ?
11.1        2.9
10.9        3.0
10.2        4.6
10.2        1.0
10.0        -1.3
-10.4       -1.1
-10.8       1.3
-11.4       -4.1
-15.7       1.?
-16.2       -4.6
-22.?       -1.1
-36.8       -2.6
-51.5       -1.4
-64.3       -1.0
-83.1       -1.6
-85.?       ?
-150.3      ?

gene symbol (scan order): KRTHB1, BRS3, DKK1, ZIC1, TLR?, KIAA0???, CDKN3,
EBI2, [illegible], STK??, MYO10, LAD1, POLE2, [illegible], BCL2L11, [illegible],
CCNB2, CCNE2, FGB, [illegible], H1F5, SERPINH2, YAP1, LPHB, TCEA2, TFF1,
COL17A1, [illegible], BPAG1, PDZK1, VEGFC, MUC6, SERPINA5, [illegible], CA12

[Footnote, partly illegible:] ... expressed genes in ER negative, but not ER
positive, breast tumors. These genes have either never been co-cited with
breast [illegible] or have a weak association, except those marked with an *.

The results derived from MedGene have two implications.
First, a careful hunt for corroborating evidence of a role in
breast cancer should precede any further study of genes with
[illegible] expression level changes. Second, any genes
with 10-fold changes or more are likely to be related to breast
cancer and warrant attention. It is likely that this threshold will
change depending on the disease as well as the experiment.
Interestingly, the observed correlation was only found among
ER-positive tumors, not ER-negative. This may reflect a bias
in the literature to study the more prevalent type of tumor in
the population. Furthermore, this emphasizes that caution
must be taken when interpreting experiments that may contain
subpopulations that behave very differently. The MedGene
approach identified a set of relatively understudied, yet highly
expressed genes in ER-negative tumors that are worthy of
further examination (Table 3).

In conclusion, we have developed an automated method of
summarizing and organizing the vast biomedical literature. To
our knowledge the resulting database is the most comprehensive
and accurate of its kind. By generating a score that reflects
the strength of the association, it provides an important tool
for the rapid and flexible analysis of large datasets from various
high-throughput screening experiments. Furthermore, it can
be used for selecting subsets of genes for functional studies,
for building disease-specific arrays, for looking at genes
common to multiple diseases and various other high-throughput
applications. In the future, it will be possible to enhance the
utility of the MedGene database by building links between
genes and other MeSH terms as well as other biological
processes and concepts, such as cell division and responses to
small molecules.
Proteome analysis: Biological assay or data archive?

In this review we examine the current state of proteome analysis. There are
three main issues discussed: why it is necessary to study proteomes; how
proteomes can be analyzed with current technology; and how proteome analysis
can be used to enhance biological research. We conclude that proteome analysis
is an essential tool in the understanding of regulated biological systems.
Current technology, while still mostly limited to the more abundant proteins,
enables the use of proteome analysis both to establish databases of proteins
present, and to perform biological assays involving measurement of multiple
variables. We believe that the utility of proteome analysis in future biological
research will continue to be enhanced by further improvements in analytical
technology.
1 Introduction

A proteome has been defined as the protein complement
expressed by the genome of an organism, or, in multicellular
organisms, as the protein complement expressed by a
tissue or differentiated cell [1]. In the most common
implementation of proteome analysis the proteins extracted
from the cell or tissue analyzed are separated by high
resolution two-dimensional gel electrophoresis (2-DE),
detected in the gel and identified by their amino acid
sequence. The ease, sensitivity and speed with which
gel-separated proteins can be identified by the use of recently
developed mass spectrometric techniques have dramatically
increased the interest in proteome technology. One
of the most attractive features of such analyses is that complex
biological systems can potentially be studied in their
entirety, rather than as a multitude of individual components.
This makes it far easier to uncover the many complex,
and often obscure, relationships between mature
gene products in cells. Large-scale proteome characterization
projects have been undertaken for a number of different
organisms and cell types. Microbial proteome projects
currently in progress include, for example: Saccharomyces
cerevisiae [2], Salmonella enterica [3], Spiroplasma
melliferum [4], Mycobacterium tuberculosis [5], Ochrobactrum
anthropi [6], Haemophilus influenzae [7], Synechocystis
spp. [8], Escherichia coli [9], Rhizobium leguminosarum
[10], and Dictyostelium discoideum [11]. Proteome
projects underway for tissues of more complex organisms
include those for: human bladder squamous cell
carcinomas [12], human liver [13], human plasma [13],
human keratinocytes [12], human fibroblasts [12], mouse
kidney [12], and rat serum [14]. In this manuscript we
critically assess the concept of proteome analysis and the
technical feasibility of establishing complete proteome
maps, and discuss ways in which proteome analysis and
biological research intersect.

2 Rationale for proteome analysis

The dramatic growth in both the number of genome
projects and the speed with which genome sequences
are being determined has generated huge amounts of
sequence information, for some species even complete
genomic sequences [15-17]. The description of the
state of a biological system by the quantitative measurement
of system components has long been a primary
objective in molecular biology. With recent technical
advances including the development of differential
display-PCR [18], cDNA microarray and DNA chip technology
[19, 20] and serial analysis of gene expression
(SAGE) [21, 22], it is now feasible to establish global and
quantitative mRNA expression maps of cells and tissues,
in which the sequence of all the genes is known, at a
speed and sensitivity which is not matched by current
protein analysis technology. Given the long-standing
paradigm in biology that DNA synthesizes RNA which
synthesizes protein, and the ability to rapidly establish
global, quantitative mRNA expression maps, the questions
which arise are why technically complex proteome
projects should be undertaken and what specific types of
information could be expected from proteome projects
which cannot be obtained from genomic and transcript
profiling projects. We see three main reasons for proteome
analysis to become an essential component in the
comprehensive analysis of biological systems: (i) protein
expression levels are not predictable from the mRNA
expression levels, (ii) proteins are dynamically modified
and processed in ways which are not necessarily
apparent from the gene sequence, and (iii) proteomes
are dynamic and reflect the state of a biological system.



2.1 Correlation between mRNA and protein expression levels

Interpretations of quantitative mRNA expression profiles
frequently implicitly or explicitly assume that for specific
genes the transcript levels are indicative of the levels of
protein expression. As part of an ongoing study in our
laboratory, we have determined the correlation of expression
at the mRNA and protein levels for a population of
selected genes in the yeast Saccharomyces cerevisiae
growing at mid-log phase (S. P. Gygi et al., submitted for
publication). mRNA expression levels were calculated
from published SAGE frequency tables [22]. Protein
expression levels were quantified by metabolic radiolabeling
of the yeast proteins, liquid scintillation counting
of the protein spots separated by high resolution 2-DE
and mass spectrometric identification of the protein(s)
migrating to each spot. The selected 80 samples constitute
a relatively homogeneous group with respect to predicted
half-life and expression level of the protein products.
Thus far, we have found a general trend but no
strong correlation between protein and transcript levels
(Fig. 1). For some genes studied, equivalent mRNA transcript
levels translated into protein abundances which
varied by more than 50-fold. Similarly, equivalent steady-state
protein expression levels were maintained by transcript
levels varying by as much as 40-fold (S. P. Gygi
et al., submitted). These results suggest that even for a
population of genes predicted to be relatively homogeneous
with respect to protein half-life and gene expression,
the protein levels cannot be accurately predicted
from the level of the corresponding mRNA transcript.
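The "fold variation at equivalent mRNA level" comparison can be made concrete with a small calculation. This is only a sketch: the function name, the 10% window used to treat mRNA levels as equivalent, and the numbers in the usage note are hypothetical and are not taken from the study.

```python
def protein_fold_range(genes, mrna_level, tolerance=0.1):
    """Spread of protein abundance among genes with similar mRNA levels.

    genes      -- iterable of (mrna_level, protein_abundance) pairs
    mrna_level -- centre of the mRNA window
    tolerance  -- half-width of the window as a fraction of mrna_level

    Returns max/min protein abundance inside the window, or None when
    fewer than two genes with positive protein abundance fall inside it.
    """
    low = mrna_level * (1 - tolerance)
    high = mrna_level * (1 + tolerance)
    proteins = [p for m, p in genes if low <= m <= high and p > 0]
    if len(proteins) < 2:
        return None
    return max(proteins) / min(proteins)
```

With made-up pairs such as `[(1.0, 2000.0), (1.05, 40.0), (0.95, 500.0)]` and a window centred on 1.0, the function reports a 50-fold spread in protein abundance even though the transcript levels differ by only a few percent.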

2.2 Proteins are dynamically modified and processed

In the mature, biologically active form many proteins are
post-translationally modified by glycosylation, phosphorylation,
prenylation, acylation, ubiquitination or one or
more of many other modifications [23] and many proteins
are only functional if specifically associated or complexed
with other molecules, including DNA, RNA, proteins
and organic and inorganic cofactors. Frequently,
modifications are dynamic and reversible and may alter
the precise three-dimensional structure and the state of
activity of a protein. Collectively, the state of modification
of the proteins which constitute a biological system
is an important indicator of the state of the system. The
type of protein modification and the sites modified at a
specific cellular state can usually not be determined
from the gene sequence alone.

Figure 1. Correlation between mRNA and protein levels in yeast cells.
For a selected population of 80 genes, protein levels were measured
by 35S-radiolabeling and mRNA levels were calculated from published
SAGE tables. Inset: expanded view of the low abundance region.
For more experimental details, also see Figs. 5 and 6 (S. P. Gygi et al.,
submitted).

2.3 Proteomes are dynamic and reflect the state of a 
biological system 

A single genome can give rise to many qualitatively and 
quantitatively different proteomes. Specific stages of the 
cell cycle and states of differentiation, responses to 
growth and nutrient conditions, temperature and stress, 
and pathological conditions represent cellular states 
which are characterized by significantly different pro- 
teomes. The proteome, in principle, also reflects events 
that are under translational and post-translational con- 
trol. It is therefore expected that proteomics will be able 
to provide the most precise and detailed molecular des- 
cription of the state of a cell or tissue, provided that the 
external conditions defining the state are carefully deter- 
mined. In answer to the question of whether the study 
of proteomes is necessary for the analysis of biomolec- 
ular systems, it is evident that the analysis of mature pro- 
tein products in cells is essential as there are numerous 
levels of control of protein synthesis, degradation, 
processing and modification, which are only apparent by 
direct protein analysis. 



3 Description and assessment of current proteome
analysis technology

3.1 Technical requirements of proteome technology

In biological systems the level of expression as well as
the states of modification, processing and macro-molecular
association of proteins are controlled and modulated
depending on the state of the system. Comprehensive
analysis of the identity, quantity and state of modification
of proteins therefore requires the detection and
quantitation of the proteins which constitute the system,
and analysis of differentially processed forms. There are
a number of inherent difficulties in protein analysis
which complicate these tasks. First, proteins cannot be
amplified. It is possible to produce large amounts of a
particular protein by over-expression in specific cell systems.
However, since many proteins are dynamically
post-translationally modified, they cannot be easily amplified
in the form in which they finally function in the
biological system. It is frequently difficult to purify from
the native source sufficient amounts of a protein for
analysis. From a technological point of view this translates
into the need for high sensitivity analytical techniques.
Second, many proteins are modified and processed
post-translationally. Therefore, in addition to the
protein identity, the structural basis for differentially
modified isoforms also needs to be determined. The
distribution of a constant amount of protein over several
differentially modified isoforms further reduces the
amount of each species available for analysis. The complexity
and dynamics of post-translational protein editing
thus significantly complicates proteome studies.
Third, proteins vary dramatically with respect to their
solubility in commonly used solvents. There are few, if
any, solvent conditions in which all proteins are soluble
and which are also compatible with protein analysis. This
makes the development of protein purification methods
particularly difficult since both protein purification and
solubility have to be achieved under the same conditions.
Detergents, in particular sodium dodecyl sulfate
(SDS), are frequently added to aqueous solvents to
maintain protein solubility. The compatibility with SDS
is a big advantage of SDS polyacrylamide gel electrophoresis
(SDS-PAGE) over other protein separation
techniques. Thus, SDS-PAGE and two-dimensional gel
electrophoresis, which also uses SDS and other detergents,
are the most general and preferred methods for
the purification of small amounts of proteins, provided
that activity does not necessarily need to be maintained.
Lastly, the number of proteins in a given cell system is
typically in the thousands. Any attempt to identify and
categorize all of these must use methods which are as
rapid as possible to allow completion of the project
within a reasonable time frame. Therefore, a successful,
general proteomics technology requires high sensitivity,
high throughput, the ability to differentiate differentially
modified proteins, and the ability to quantitatively display
and analyze all the proteins present in a sample.

3.2 2-D electrophoresis - mass spectrometry: a common
implementation of proteome analysis

The most common currently used implementation of
proteome analysis technology is based on the separation
of proteins by two-dimensional (IEF/SDS-PAGE) gel
electrophoresis and their subsequent identification and
analysis by mass spectrometry (MS) or tandem mass
spectrometry (MS/MS). In 2-DE, proteins are first separated
by isoelectric focusing (IEF) and then by SDS-PAGE,
in the second, perpendicular dimension. Separated
proteins are visualized at high sensitivity by staining
or autoradiography, producing two-dimensional arrays of
proteins. 2-DE gels are, at present, the most commonly
used means of global display of proteins in complex
samples. The separation of thousands of proteins has
been achieved in a single gel [24, 25] and differentially
modified proteins are frequently separated. Due to the
compatibility of 2-DE with high concentrations of detergents,
protein denaturants and other additives promoting
protein solubility, the technique is widely used.

The second step of this type of proteome analysis is the
identification and analysis of separated proteins. Individual
proteins from polyacrylamide gels have traditionally
been identified using N-terminal sequencing [26, 27],
internal peptide sequencing [28, 29], immunoblotting or
comigration with known proteins [30]. The recent dramatic
growth of large-scale genomic and expressed
sequence tag (EST) sequence databases has resulted in a
fundamental change in the way proteins are identified by
their amino acid sequence. Rather than by the traditional
methods described above, protein sequences are now
frequently determined by correlating mass spectral or
tandem mass spectral data of peptides derived from proteins,
with the information contained in sequence databases
[31-33].

There are a number of alternative approaches to proteome
analysis currently under development. There is
considerable interest in developing a proteome analysis
strategy which bypasses 2-DE altogether, because it is
considered a relatively slow and tedious process, and
because of perceived difficulties in extracting proteins
from the gel matrix for analysis. However, 2-DE as a
starting point for proteome analysis has many advantages
compared to other techniques available today. The
most significant strengths of the 2-DE-MS approach
include the relatively uniform behavior of proteins in
gels, the ability to quantify spots and the high resolution
and simultaneous display of hundreds to thousands of
proteins within a reasonable time frame.

A schematic diagram of a typical procedure of the identification
of gel-separated proteins is shown in Fig. 2. Protein
spots detected in the gel are enzymatically or chemically
fragmented and the peptide fragments are isolated
for analysis, as already indicated, most frequently by MS
or MS/MS. There are numerous protocols for the generation
of peptide fragments from gel-separated proteins.
They can be grouped into two categories, digestion in
the gel slice [28, 34] or digestion after electrotransfer out
of the gel onto a suitable membrane ([29, 35-37] and
reviewed in [38]). In most instances either technique is
applicable and yields good results. The analysis of MS or
MS/MS data is an important step in the whole process
because MS instruments can generate an enormous
amount of information which cannot easily be managed
manually. Recently, a number of groups have developed
software systems dedicated to the use of peptide MS
and MS/MS spectra for the identification of proteins.
Proteins are identified by correlating the information
contained in the MS spectra of protein digests or
MS/MS spectra of individual peptides with data contained
in DNA or protein sequence databases.

Tlie .systems we are currently using in our laboratory are 
based on the separation of the peptides contained in pro- 
tein digests by narrow bore or capillary liquid chromatog- 
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Ftfiure 2. Schematic diagram of a procedure for identification of gel- 
separaled proteins. Peptides can either be separated by a technique 
such as LC or CE, or infused as a mixture and sorted in the MS. Data- 
base searching can either be performed on peptide masses from an 
MS spectrum, peptide fragment masses from CID spectra of peptides, 
or a combination of both. 



raphy (39, 40J or capillary electrophoresis [41], the anal- 
ysis of the separated peptides by electrospray ioniza- 
tion (ESI) MS/MS, and the correlation of the generated 
peptide spectra with sequence databases using the 
SEQUEST program developed at the University of Wash- 
ington [32, 331. The system automatically performs the 
following operations: a particular peptide ion character- 
ized by its mass-to-charge ratio is selected in the MS out 
of all the peptide ions present in the system at a parti- 
cular time; the selected peptide ion is collided in a colli- 
sion cell with argon (collision-induced dissociation, 
CID) and the masses of the resulting fragment ions are 
determined in the second sector of the tandem MS; this 
experimentally determined CID spectrum is then corre- 
lated with the CID spectra predicted from all the pep- 
lides in a sequence database which have essentially the 
same mass as the peptide selected for CID; this correla- 
tion matches the isolated peptide with a sequence seg- 
ment in a database and thus identifies the protein from 
which the peptide was derived. There are a number of 
alternative programs which use peptide CID spectra for 
protein identification, but we use the SEQUEST system 
because it is currently the most highly automated pro- 
gram and has proven to be successful, versatile and 
robust. 
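
The sequence of operations described above (select a peptide ion, record its CID fragment masses, then compare them with fragments predicted for database peptides of essentially the same mass) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the tolerances, and the simple shared-peak count used as a score are hypothetical and are not the SEQUEST scoring scheme; only singly charged b- and y-type fragment ions are predicted.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the residue sum of a free peptide
PROTON = 1.00728   # mass of the charge-carrying proton


def peptide_mass(seq):
    """Neutral monoisotopic mass of a peptide sequence."""
    return sum(RESIDUE_MASS[a] for a in seq) + WATER


def fragment_mz(seq):
    """Predicted m/z values of singly charged b- and y-ions."""
    ions = []
    for i in range(1, len(seq)):
        b_ion = sum(RESIDUE_MASS[a] for a in seq[:i]) + PROTON
        y_ion = sum(RESIDUE_MASS[a] for a in seq[i:]) + WATER + PROTON
        ions.extend([b_ion, y_ion])
    return ions


def shared_peaks(observed, predicted, tol=0.5):
    """Count predicted fragments that lie within tol of an observed peak."""
    return sum(any(abs(o - p) <= tol for o in observed) for p in predicted)


def identify(precursor_mass, observed, database, mass_tol=1.0):
    """Return (score, protein, peptide) for the best match, or None.

    database maps a protein name to its list of peptide sequences
    (hypothetical layout chosen for this sketch).
    """
    best = None
    for protein, peptides in database.items():
        for pep in peptides:
            # Only peptides of essentially the same mass are candidates.
            if abs(peptide_mass(pep) - precursor_mass) > mass_tol:
                continue
            score = shared_peaks(observed, fragment_mz(pep))
            if best is None or score > best[0]:
                best = (score, protein, pep)
    return best
```

The mass filter keeps the number of predicted spectra small, which is what makes a search against a whole sequence database practical; the fragment comparison then separates candidates that happen to share a precursor mass.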

3.3 Protein identification by LC-MS/MS, capillary
LC-MS/MS and CE-MS/MS

It has been demonstrated repeatedly that MS has a very
high intrinsic sensitivity. For the routine analysis of
gel-separated proteins at high sensitivity, the most significant
challenge is the handling of small amounts of
sample. The crux of the problem is the extraction and
transferal of peptide mixtures generated by the digestion
of low nanogram amounts of protein, from gels into the
MS/MS system without significant loss of sample or
introduction of unwanted contaminants. We employ
three different systems for introducing gel-purified samples
into an MS, depending on the level of sensitivity



required. As an approximate guideline, for samples con- 
taining tens of picomoles of peptides, LC-MS/MS is 
most appropriate; for samples containing low picomole 
amounts to high femtomole amounts we use capillary 
LC-MS/MS; and for samples containing femtomoles or 
less, CE-MS/MS is the method of choice.
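
The guideline above can be written as a small selection function. The numeric cutoffs are assumptions introduced only for this sketch: the text gives qualitative ranges ("tens of picomoles", "low picomole to high femtomole", "femtomoles or less"), so the values of 10 pmol and 100 fmol below are hypothetical stand-ins, and the function name is likewise illustrative.

```python
def choose_method(peptide_amount_fmol,
                  lc_cutoff_fmol=10_000.0,
                  capillary_lc_cutoff_fmol=100.0):
    """Pick a sample introduction method from the peptide amount.

    Both cutoffs are hypothetical; the guideline in the text is
    qualitative and gives no exact boundaries.
    """
    if peptide_amount_fmol < 0:
        raise ValueError("amount must be non-negative")
    if peptide_amount_fmol >= lc_cutoff_fmol:
        return "LC-MS/MS"            # tens of picomoles
    if peptide_amount_fmol >= capillary_lc_cutoff_fmol:
        return "capillary LC-MS/MS"  # low picomole to high femtomole
    return "CE-MS/MS"                # femtomoles or less


for amount in (50_000, 1_000, 5):
    print(amount, "fmol ->", choose_method(amount))
```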

3.3.1 LC-MS/MS 

The coupling of an MS to an HPLC system using a 
0.5 mm diameter or bigger reverse phase (RP) column 
has been described in detail [42]. This system has several
advantages if a large number of samples are to be ana- 
lyzed and all are available in sufficient quantity. The 
LC-MS and database searching program can be run in a 
fully automated mode using an autosampler, thus maxi- 
mizing sample throughput and minimizing the need for 
operator interference. The relatively large column is 
tolerant of high levels of impurities from either gel
preparation or sample matrix. Lastly, if configured with a
flow-splitter and micro-sprayer [40], analyses can be per- 
formed on a small fraction of the sample (less than 5%) 
while the remainder of the sample is recovered in very 
pure solvents. This latter feature is particularly useful 
when an orthogonal technique is also used to analyze 
peptide fractions, such as scintillation of an introduced 
radiolabel, and this data can be correlated with peptides 
identified by CID spectra. 

3.3.2 Capillary LC-MS 

An increase of sensitivity of approximately tenfold can be 
achieved by using a capillary LC system with a 100 µm ID
column rather than a 0.5 mm ID column as referred to 
above. Since very low flow rates are required for such 
columns, most reports have used a precolumn flow split- 
ting system for producing solvent gradients. We have 
recently described the design and construction of a novel
gradient mixing system which enables the formation
of reproducible gradients at very low flow rates (low
nL/min) without the need for flow splitting (A. Ducret
et al., submitted for publication). Using this capillary
LC-MS/MS system we were able to identify gel-separat- 
ed proteins if low picomole to high femtomole amounts 
were loaded onto the gel [40]. This system is as yet not
automated and, like all capillary LC systems, is prone to 
blockage of the columns by microparticulates when ana- 
lyzing gel-separated proteins. 

3.3.3 CE-MS/MS

The highest level of sensitivity for analyzing gel-separated
proteins can be achieved by using capillary electrophoresis -
mass spectrometry (CE-MS). We have described
in the past a solid-phase extraction capillary electrophoresis
(SPE-CE) system which was used with triple
quadrupole and ion trap ESI-MS/MS systems for the
identification of proteins at the low femtomole to
subfemtomole sensitivity level [43, 44]. While this system is
highly sensitive, it is labor-intensive to operate and has
not been automated. In order to devise an
analytical system with both the sensitivity of a CE and
the level of automation of LC, we have constructed
microfabricated devices for the introduction of samples
into ESI-MS for high-sensitivity peptide analysis.

Figure 3. Schematic illustration of a
microfabricated analytical system for CE,
consisting of a micromachined device,
coated capillary electroosmotic pump,
and microelectrospray interface. The
dimensions of the channels and reservoir
are as indicated in the text. The channels
on the device were graphically enhanced
to make them more visible. Reproduced
from [45], with permission.

The basic device is a piece of glass into which channels 
of 10-30 µm in depth and 50-70 µm in diameter are
etched by using photolithography/etching techniques
similar to the ones used in the semiconductor industry.
(A simple device is shown in Fig. 3). The channels are
connected to an external high voltage power supply [45].
Samples are manipulated on the device and off the
device to the MS by applying different potentials to the
reservoirs. This creates a solvent flow by electroosmotic 
pumping which can be redirected by changing the posi- 
tion of the electrode. Therefore, without the need for 
valves or gates and without any external pumping, the 
flow can be redirected by simply switching the position 
of the electrodes on the device. The direction and rate of 
the flow can be modulated by the size and the polarity 
of the electric field applied and also by the charge state
of the surface. 

The type of data generated by the system is illustrated in 
Fig. 4, which shows the mass spectrum of a peptide sample 
representing the tryptic digest of carbonic anhydrase at 
290 fmol/µL. Each numbered peak indicates a peptide successfully
identified as being derived from carbonic anhydrase.
Some of the unassigned signals may be chemical
or peptide contaminants. The MS is programmed to auto- 
matically select each peak and subject the peptide to CID. 
The resulting CID spectra are then used to identify the 
protein by correlation with sequence databases. Therefore, 
this system allows us to concurrently apply a number of 
protein digests onto the device, to sequentially mobilize 
the samples, to automatically generate CID spectra of 
selected peptide ions and to search sequence databases 
for protein identification. These steps are performed auto- 
matically without the need for user input and proteins can 
be identified at very low femtomole level sensitivity at a 
rate of approximately one protein per 15 min. 

3.4 Assessment of 2-DE-MS proteome technology

Figure 4. MS spectrum of a tryptic digest
of carbonic anhydrase using the microfabricated
system shown in Fig. 3. 290 fmol/µL
of carbonic anhydrase tryptic
digest was infused into a Finnigan LCQ
ion trap MS. Each peak was selected for
CID, and those which were identified as
containing peptides derived from carbonic
anhydrase are numbered. Reproduced
from [45], with permission.

Figure 5. 2-DE separation of a lysate of yeast cells, with identified proteins highlighted. The first dimension of separation was an IPG from
pH 3-10, and the second dimension was a 10%T SDS-PAGE gel. Proteins were visualized by silver staining. Further details of experimental
procedures are included in S. P. Gygi et al. (submitted).

Using a combination of the analytical techniques described
above we have identified the 80 protein spots
indicated in Fig. 5. The protein pattern was generated by
separating a total of 40 micrograms of protein contained
in a total cell lysate of the yeast strain YPH499 by high
resolution 2-DE and silver staining of the separated proteins.
To estimate how far this type of proteome analysis
can penetrate towards the identification of low abundance
proteins, we have calculated the codon bias of the
genes encoding the respective proteins. Codon bias is a
calculated measure of the degree of redundancy of triplet
DNA codons used to produce each amino acid in a
particular gene sequence. It has been shown to be a
useful indicator of the level of the protein product of a
particular gene sequence present in a cell [46]. The general
rule which applies is that the higher the value of the
codon bias calculated for a gene, the more abundant the
protein product of that gene. The calculated
codon bias values corresponding to the proteins identified
in Fig. 5 are shown in Fig. 6b. Nearly all of the proteins
identified (> 95%) have codon bias values of > 0.2,
indicating they are highly abundant in cells. In contrast,
codon bias values calculated for the entire yeast genome
(Fig. 6a) show that the majority of proteins present in
the proteome have a codon bias of < 0.2 and are thus of
low abundance.
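
The comparison made here, the share of codon bias values above 0.2 among the identified spots versus the whole genome, amounts to a simple threshold count. The sketch below shows that calculation only; the function name is illustrative and the two lists are hypothetical values invented for the example, not data from this study.

```python
def fraction_above(values, threshold=0.2):
    """Fraction of codon bias values strictly greater than threshold."""
    if not values:
        raise ValueError("values must not be empty")
    return sum(1 for v in values if v > threshold) / len(values)


# Hypothetical codon bias values, invented purely for illustration.
identified_spots = [0.85, 0.62, 0.41, 0.33, 0.15]
whole_genome = [0.05, 0.08, 0.12, 0.18, 0.25, 0.60, 0.03, 0.10]

print(f"identified spots above 0.2: {fraction_above(identified_spots):.2f}")
print(f"whole genome above 0.2:     {fraction_above(whole_genome):.2f}")
```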

This finding is of considerable importance in our assessment
of the current status of proteome analysis technology.
It is clear that even using highly sensitive analytical
techniques, we are only able to visualize and identify the



more abundant proteins. Since many important regula- 
tory proteins are present only at low abundance, these 
would not be amenable to analysis using such tech- 
niques. This situation would be exacerbated in the anal- 
ysis of proteomes containing many more proteins than 
the approximately 6000 gene products present in yeast 
cells [16]. In the analysis of, for example, the proteome
of any human cells, there are potentially 50000-100000
gene products [47]. Inherent limitations on the amount
of protein that can be loaded on 2-DE, and the number 
of components that can be resolved, indicate that only 
the most highly abundant fraction of the many gene 
products could be successfully analyzed. One approach 
that has been employed to circumvent these limitations
is the use of very narrow range immobilized pH gradient
strips for the first-dimension separation of 2-DE [48].
Since only those proteins which focus within the narrow 
range will enter the second dimension of separation, a 
much higher sample loading within the desired range is 
possible. This, in turn, can lead to the visualization and
identification of less abundant proteins. 



P. A. Haynes et al.

Electrophoresis 1998, 19, 1862-1871

Figure 6. Calculated codon bias values for yeast proteins (two panels,
(A) and (B); axis label: Codon Bias). (A) Distribution
of calculated values for the entire yeast proteome. (B) Distribution
of calculated values for the subset of 80 identified proteins also
shown in Figs. 1 and 5. Further details of experimental procedures are
included in S. P. Gygi et al. (submitted).



4 Utility of proteome analysis for biological 
research 

For the success of proteomics as a mainstream approach 
to the analysis of biological systems it is essential to
define how proteome analysis and biological research
projects intersect. Without a clear plan for the implementation
of proteome-type approaches into biological research
projects the full impact of the technology cannot
be realized. The literature indicates that proteome anal- 
ysis is used both as a database/data archive, and as a bio- 
logical assay or biological research tool. 

4.1 The proteome as a database 

The use of proteomics as a database or data archive 
essentially entails an attempt to identify all the proteins 
in a cell or species and to annotate each protein with the 
known biological information that is relevant for each 
protein. The level of annotation can, of course, be extensive.
The most common implementation of this idea is
the separation of proteins by high resolution 2-DE, the
identification of each detected protein spot and the
annotation of the protein spots in a 2-DE gel database
format. This approach is complicated by the fact that it is
difficult to precisely define a proteome and to decide
which proteome should be represented in the database.
In contrast to the genome of a species, which is essentially
static, the proteome is highly dynamic. Processes
such as differentiation, cell activation and disease can all
significantly change the proteome of a species. This is
illustrated in Fig. 7. The figure shows two high-resolution
2-DE maps of proteins isolated from rat serum.
Fig. 7A is from the serum of normal rats, while Fig. 7B
is acute-phase serum from rats after
prior treatment with an inflammation-causing agent [49].
It is obvious that the protein patterns are significantly 
different in several areas, raising the question of exactly 
which proteome is being described. 

Therefore, a comprehensive proteome database of a spe- 
cies or cell type needs to contain all of the parameters 
which describe the state and the type of the cells from 
which the proteins were extracted as well as the software 
tools to search the database with queries which reflect 
the dynamics of biological systems. A comprehensive 
proteome database should be capable of quantitatively 
describing the fate of each protein if specific systems 
and pathways are activated in the cell. Specifically, the 
quantity, the degree of modification, the subcellular loca- 
tion and the nature of molecules specifically interacting 
with a protein as well as the rate of change of these 
variables should be described. Using these admittedly 
stringent criteria, there is currently no complete proteome
database. A number of such databases are, however, in 
the process of being constructed. The most advanced 
among them, in our opinion, are the yeast protein database
YPD [50] (accessible at http://www.ypd.com) and
the human 2D-PAGE databases of the Danish Centre
for Human Genome Research [12] (accessible at
http://biobase.dk/cgi-bin/celis). While neither can be
considered complete as not all of the potential gene products
are identified, both contain extensive annotation
of supplemental information for many of the spots 
which are positively identified in reference samples. 

4.2 The proteome as a biological assay

The use of proteome analysis as a biological assay or 
research tool represents an alternative approach to inte- 
grating biology with proteomics. To investigate the state 
of a system, samples are subjected to a specific process
that allows the quantitative or qualitative measurement 
of some of the variables which describe the system. In 
typical biochemical assays one variable (e.g., enzyme 
activity) of a single component (e.g., a particular
enzyme) is measured. Using proteomics as an assay,
multiple variables (e.g., expression level, rate of synthesis,
phosphorylation state, etc.) are measured concurrently 
on many (ideally all) of the proteins in a sample. The 
use of proteomics as an assay is a less far-reaching prop- 
osition than the construction of a comprehensive pro- 
teome database. It does, however, represent a pragmatic 
approach which can be adapted to investigate specific 
systems and pathways, as long as the interpretation of 
the results takes into account that with current technol- 
ogy not all of the variables which describe the system 
can be observed (see Section 3.4). 

A common implementation of proteome analysis as a 
biological assay is when a 2-DE protein pattern generated
from the analysis of an experimental sample is
compared to an array of reference patterns representing
different states of the system under investigation. The
state of the experimental system at the time the sample
was generated is therefore determined by the quantitative
comparative analysis of hundreds to a few thousand
proteins. Comparative analysis of the 2-DE patterns
furthermore highlights quantitative and qualitative differences
in the protein profiles which correlate with the
state of the system. For this type of analysis it is not
essential that all the proteins are identified or even visualized,
although the results become more informative as
more proteins are compared. It is obvious, however, that 
the possibility to identify any protein deemed character- 
istic for a particular state dramatically enhances this 
approach by opening up new avenues for experimenta- 
tion. 





Figure 7. High resolution 2-DE map of proteins isolated from rat serum with or without prior exposure to an
inflammation-causing agent. (A) normal rat serum, (B) acute-phase serum from rats which had previously been exposed to
an inflammation-causing agent. The first dimension of separation is an IPG from pH 4-10, and the second dimension
is a 17[...] gradient SDS-PAGE gel. Proteins were visualized by staining with amido black. Further details
of experimental procedures are included in [14, 49].

Proteome analysis as a biological assay has been successfully
used in the field of toxicology, to characterize
disease states or to study differential activation of cells.
The approach is limited, of course, by the fact that only
the visible protein spots are included in the assay, and it
is well known that a substantial but far from complete
fraction of cellular proteins are detected if a total cell
lysate is separated by 2-DE. Proteins may not be
detected in 2-DE gels because they are not abundant
enough to be visualized by the detection method used,
because they do not migrate within the boundaries (size,
pI) resolved by the gel, because they are not soluble
under the conditions used, or for other reasons. 

A different way to use proteome analysis as a biological 
assay to define the state of a biological system is to take
advantage of the wealth of information contained in 
2-DE protein patterns. 2-DE is referred to as two-dimen- 
sional because of the electrophoretic mobility and the 
isoelectric points which define the position of each pro- 
tein in a 2-DE pattern. In addition to the two dimen- 
sions used to generate the protein patterns, a number of 
additional data dimensions are contained in the protein 
patterns. Some of these dimensions such as protein 
expression level, phosphorylation state, subcellular loca- 
tion, association with other proteins, rate of synthesis or 
degradation indicate the activity state of a protein or a 
biological system. Comparative analysis of 2-DE protein 
patterns representing different states is therefore ideally
suited for the detection, identification and analysis of 
suitable markers. Once again it must be emphasized that 
in this type of experiment only a fraction of the cellular 
proteins is analyzed. Since many regulatory proteins are 
of low abundance, this limitation is a concern, particu- 
larly in cases in which regulatory pathways are being 
investigated. 

5 Concluding remarks 

In this report we have addressed three main issues 
related to proteome analysis. First, we have discussed 
the rationale for studying proteomes. Second, we have 
assessed the technical feasibility of analyzing proteomes 
and described current proteome technology, and third, 
we have analyzed the utility of proteome analysis for bio- 
logical research. It is apparent that proteome analysis is 
an essential tool in the analysis of biological systems. 
The multi-level control of protein synthesis and degrada- 
tion in cells means that only the direct analysis of 
mature protein products can reveal their correct identities,
their relevant state of modification and/or association
and their amounts. Recently developed methods
have enabled the identification of proteins at ever-increasing
sensitivity levels and at a high level of automation
of the analytical processes. A number of technical
challenges, however, remain. While it is currently
possible to identify essentially any protein spots that can
be visualized by common staining methods, it is apparent
that without prior enrichment only a relatively
small and highly selected population of long-lived,
highly expressed proteins is observed. There are many
more proteins in a given cell which are not visualized by
such methods. Frequently it is the low abundance proteins
that execute key regulatory functions.

We have outlined the two principal ways proteome analysis
is currently being used to intersect with biological
research projects: the proteome as a database or data 
archive and proteome analysis as a biological assay. Both 
approaches have in common that at present they are con- 
ceptually and technically limited. Current proteome data- 
bases typically are limited to one cell type and one state 
of a cell and therefore do not account for the dynamics 
of biological systems. The use of proteome analysis as a 
biological assay can provide a wealth of information, but 
it is limited to the proteins detected and is therefore not 
truly proteome-wide. These limitations in proteomics are 
to a large extent a reflection of the fact that proteins in 
their fully processed form cannot easily be amplified and 
are therefore difficult to isolate in amounts sufficient for
analysis or experimentation. The fact that to date no 
complete proteome has been described further attests to 
these difficulties. With continued rapid progress in pro- 
tein analysis technology, however, we anticipate that the 
goal of complete proteome analysis will eventually 
become attainable.

[...] from the National Science Foundation Science and Technology
Center for Molecular Biotechnology and from the NIH.

Discordant Protein and mRNA Expression in
Lung Adenocarcinomas

The relationship between gene expression measured at
the mRNA level and the corresponding protein level is not
well characterized in human cancer. In this study, we
compared mRNA and protein expression for a cohort of
genes in the same lung adenocarcinomas. The abundance
of 165 protein spots representing 98 individual
genes was analyzed in 76 lung adenocarcinomas and nine
non-neoplastic lung tissues using two-dimensional
polyacrylamide gel electrophoresis. Specific polypeptides
were identified using matrix-assisted laser desorption/
ionization mass spectrometry. For the same 85 samples,
mRNA levels were determined using oligonucleotide
microarrays, allowing a comparative analysis of mRNA and
protein expression among the 165 protein spots. Twenty-eight
of the 165 protein spots (17%) or 21 of 98 genes
(21.4%) had a statistically significant correlation between
protein and mRNA expression (r > 0.2445; p < 0.05);
however, among all 165 proteins the correlation coefficient
values (r) ranged from -0.[...] to 0.442. Correlation
coefficient values were not related to protein abundance.
Further, no significant correlation between mRNA and
protein expression was found (r = -0.025) if the average
levels of mRNA or protein among all samples were applied
across the 165 protein spots (98 genes). The mRNA/
protein correlation coefficient also varied among proteins
with multiple isoforms, indicating potentially separate
isoform-specific mechanisms for the regulation of
protein abundance. Among the 21 genes with a significant
correlation between mRNA and protein, five genes
differed significantly between stage I and stage III lung
adenocarcinomas. Using a quantitative analysis of mRNA
and protein expression within the same lung adenocarcinomas,
we showed that only a subset of the proteins
exhibited a significant correlation with mRNA abundance.
Molecular & Cellular Proteomics 1:304-313, 2002.



Lung cancer is the leading cause of cancer death for both
men and women in the United States. Adenocarcinomas of
the lung comprise ~40% of all new cases of non-small cell
lung cancer and are now the most common histologic type.
Functional genomics, broadly defined as the comprehensive
analysis of genes and their products, have become a recent
[...] lung adenocarcinomas has the potential to aid in the
identification of high risk patients with resectable early stage
lung cancer that may benefit from adjuvant therapy, as well as
to identify new therapeutic targets. In human lung cancer,
however, little is currently understood regarding the
relationship between gene expression as determined by
measuring mRNA levels and the corresponding abundance of the
protein products.

A number of powerful techniques for analysis of gene expression
have been used including differential display (2),
serial analysis of gene expression (3), DNA microarrays (4),
and proteomics via two-dimensional polyacrylamide gel
electrophoresis and mass spectrometry (5). Bioinformatics tools
have also been developed to help determine quantitative
mRNA/protein expression profiles of all types of cells and
tissues (6) and now can be applied to benign and malignant
tumors. DNA microarrays (cDNA and oligonucleotide) permit
the parallel assessment of thousands of genes and have been
utilized in gene expression monitoring (7), polymorphism
analysis (8), and DNA sequencing (9). Recent studies have
focused on classification or identification of subgroups of lung
tumors using DNA microarrays (10, 11). The use of mRNA
expression patterns by themselves, however, is insufficient for
understanding the expression of protein products, as additional
post-transcriptional mechanisms, including protein
translation, post-translational modification, and degradation,
may influence the level of a protein present in a given cell or
tissue. Proteomic analysis, a complementary technology to
DNA microarrays for monitoring gene expression, involves
protein separation and quantitative assessment of protein
spots using 2D-PAGE and protein identification using mass
spectrometry. By combining proteomic and transcriptional
analyses of the same samples, however, it may be possible to
understand the complex mechanisms influencing protein
expression in human cancer.

The abbreviations used are: 2D, two-dimensional; MALDI-MS,
matrix-assisted laser desorption/ionization mass spectrometry.

Table I
Correlation coefficients of protein and mRNA where only one spot was present on 2D gels
r*, correlation coefficient value > 0.2445; p < 0.05. Values in boldface are significant at p < 0.05.

Spot    Unigene       Gene name   r*        Protein name
                                            14-3-3 σ
                                            Annexin IV
                                            DJ-1 protein [...]
                                            Superoxide dismutase (Cu-Zn)
                                            Qeloo^ 1
                                            Transformation up-regulated nuclear protein
                                            Ferritin light chain
vow     ns.ouu/i 1    ANXA5       0.2488    Annexin V
19S9    Lie A^tAtl    FSMC        0.2446    26S proteasome p28
U9vD    Lin O^it^AQ   LDHB        04420     Lactate dehydrogenase H chain (LDH-B)
1171    r18.£415l9    COX11       0^10      COX11
lloO    Hs.181013     PGAM1       0.2028    Phosphoglycerate mutase
        Hs.74635      DLD         0.1985    Dihydrolipoamide dehydrogenase precursor
1i9o    Hs.83383      AOE372      0.1932    Antioxidant enzyme AOE37-2
0172    Hs.3069       HSPA9B      0.1872    GRP75
0777    Hs.978        PDHB        0.1855    Pyruvate dehydrogenase E1-β subunit precursor
1249    Hs.228795     GSTP1       0.1773    Glutathione S-transferase pi (GST-pi)
16d$    Hs.76136      TXN         0.1732    Thioredoxin
120S    Hs.82314      HPRT1       0.1588    HG phosphoribosyltransferase
1230    Hs.279860     TPT1        0.1466    Translationally controlled tumor protein (TCTP)
0603    Hs.181357     LAMR1       0.1463    LAMR
1358    Hs.28914      APRT        0.1399    Adenine phosphoribosyl transferase
1410    H8.S2113      DUT         0.1213    dUTP pyrophosphatase (dUTPase)
1826    Hs.112378     LIMS1       0.1213    Pinch-2 protein
0871    Hs.250502     CA8         0.11^     Carbonic anhydrase-related protein; Syntaxin
0289    Hs.82916      CCT6A       0.1106    Chaperonin-like protein
1143    Hs.11465      GSTTLp28    0.0997    Glutathione S-transferase homolog (GST homolog)
1466    Hs,11$$38     NME1        0.0932    Nm23 (NDPKA)
1698    Hs.278503     RIG         0.0905    RIG (U32331)
1354    K9.aS76l      ATP5D       0.0904    F1F0-type ATP synthase subunit d
1445    Hs.155485     HIP2        0.0843    Huntingtin interacting protein 2 (HIP2)
        Hs.177488     APP         0.0746    Amyloid B4A
UDUO    Hs.182285     KRT19       0.0439    Cytokeratin 19
IVf 1   Hs.10842      RAN         0.0277    GTP-binding nuclear protein RAN (TC4)
vv91    Hd.^7939      CTSB        0.0264    Cathepsin B
        Hs.77274      PLAU        0.0248    Urokinase plasminogen activator
VOCQ    r!Q,l 98248   B4GALT1     0.0183    β 1,4-galactosyl transferase
        Hs.1247       APOA4       0X)l7e    Apolipoprotein A4 (ApoA4)
        Hs.104143     CLTA        0.0123    Clathrin light chain A
        H8.51Z3       SID6-306    0.0117    Cytosolic inorganic pyrophosphatase
        Hs.1473       GRP         -0.0040   Preprogastrin-releasing peptide
        Hs^74402      HSPA1B      -0.0071   Heat shock-induced protein
1414    Hs.77641                  -0.0096   ADP-ribosylation factor 1
0710    Hs.97206      HIP1        -0.0114   Huntingtin interacting protein 1 (HIP1)
053a    Hs.170328     MSN         -0.0132   Moesin
0525    Hs.284255     ALPP        -0.0148   Alkaline phosphatase, placental
0513    Hs.70901      PDIR        -0.0289   Protein disulfide isomerase-related protein 5
1659    H3.26e697     HINT        -0.0312   Protein kinase C inhibitor
1262    Hs.7016       RAB7        -0.0362   Rab 7 protein
0180    Hs.184411     ALB         -0.0470   Albumin
0948    Hs.2795       LDHA        -0.0549   Lactate dehydrogenase-A (LDHA)
0502    Hs.180532     GPI         -0.0575
0152    Hs.75410      HSPA5       -0.0640   GRP78
1054    Hs.74276      CLIC1       -0.0686   Nuclear chloride channel (RNCC protein)
0709    Hs.253495     SFTPD       -0.0936   Pulmonary surfactant protein D
0887    Hs.78996      PCNA        -0.0982   PCNA
0165    Hs.180414     HSPA8       -0.1014   Heat shock cognate protein, 71 kDa
1109    Hs.75103      YWHAZ       -0.1018   14-3-3 ζ
0137    H8.5$4        SSA2        -0.1032   Ro/SS-A antigen
0278    Hs^li2        TCP1        -0.1237   T-complex protein 1, α subunit
1769    Hs.9614       NPM1        -0.1738   B23/numatrin
00S9    Hs.74335      HSPCB       -0.2049   Hsp90
2511    Hs.153179     FABP5       -0^109    E-FABP/FABP5
1739    H3.1648a      CALR        -0.2344   Calreticulin 32
1138    Hs.3019ei     GSTM4       -0.243$   Glutathione S-transferase M4 (GST m4)
2633    Hs.77060      PSMB6       -0.2512   Macropain subunit Δ

In this study, we determined mRNA and protein levels for
165 proteins (98 genes) in 76 lung adenocarcinomas and nine
non-neoplastic lung tissues. Protein levels were determined
using quantitative 2D-PAGE analysis, and the separated protein
polypeptides were identified using matrix-assisted laser
desorption/ionization mass spectrometry (MALDI-MS). The
corresponding mRNA levels for the identified proteins within
the same samples were determined using oligonucleotide
microarrays. Correlation analyses showed that protein abundance
is likely a reflection of the transcription for a subset of
proteins, but translation and post-translational modifications
also appear to influence the expression levels of many individual
proteins in lung adenocarcinomas.
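
For a single protein spot, the correlation analysis described here reduces to computing a correlation coefficient between the per-sample protein levels and the per-sample mRNA levels, then comparing it with the cutoff reported in this paper (r > 0.2445 for p < 0.05). The sketch below assumes a Pearson coefficient, which is an assumption of this example since the text here does not say which coefficient was used; the function names are illustrative and the input lists are hypothetical.

```python
import math

# Significance cutoff quoted in the paper (r > 0.2445; p < 0.05).
R_CUTOFF = 0.2445


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def is_significant(r, cutoff=R_CUTOFF):
    """True if r exceeds the cutoff used in the paper."""
    return r > cutoff


# Hypothetical per-sample levels for one spot, for illustration only.
protein_levels = [1.2, 0.8, 1.9, 1.1, 1.5]
mrna_levels = [310.0, 250.0, 420.0, 330.0, 300.0]

r = pearson_r(protein_levels, mrna_levels)
print(f"r = {r:.4f}, significant: {is_significant(r)}")
```

Note that the cutoff applies to the sample size used in the study; with only five hypothetical samples, as in this usage example, the same value of r would not carry the same significance.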

EXPERIMENTAL PROCEDURES

Tissues - Fifty-seven stage I and 19 stage III lung adenocarcinomas,
as well as nine non-neoplastic lung tissue samples, were used
for protein and mRNA analyses. Patient consent was obtained, and
the project was approved by the Institutional Review Board. All
tissues were obtained after resection at the University of Michigan
Health System between May 1991 and July 1988. Tissues were all
snap-frozen in liquid nitrogen and then stored at -80 °C. The patients
included 46 females and 30 males ranging in age from 40.9 to 84.6
[...] demonstrated a positive [...] samples were classified as
bronchial-derived, 14 were classified as bronchoalveolar, and one had
[...]. Eighteen tumor samples were classified as well differentiated,
[...] were classified as moderate, and 19 were classified as
poorly differentiated adenocarcinomas. Hematoxylin-stained cryostat
sections [...], prepared from the same tumor pieces to be utilized
for protein and mRNA isolation, were evaluated by a pathologist and
compared with hematoxylin- and eosin-stained sections made from
[...]. Specimens were excluded from
analysis if they showed unclear or mixed histology (e.g.
adenosquamous), [...] metastatic origin [...]
extensive lymphocytic infiltration,
[...] or if the patient had received prior [...].

Oligonucleotide Array Hybridization-The HuGeneFL oligonucleotide
arrays (Affymetrix, Santa Clara, CA) containing 6800 genes were used
in this study. Total RNA was isolated from all samples using Trizol
reagent (Invitrogen). The resulting RNA was then subjected to further
purification using RNeasy spin columns (Qiagen). Preparation of cRNA,
hybridization, and scanning of the HuGeneFL arrays were performed
according to the manufacturer's protocol (Affymetrix, Santa Clara,
CA). Data analysis was performed using GeneChip 4.0 software. The
gene expression profile of each tumor was normalized to the median
gene expression profile for the entire sample. Details of data
trimming and normalization are described elsewhere (11).

2D-PAGE and Quantitative Protein Analysis-Tissue for both protein and
mRNA isolation came from contiguous areas of each sample. Protein
separation using 2D-PAGE, silver staining, and digitization were
performed as described previously (12, 13). Our 2D-PAGE system [...]
spot detection [...] software (Bioimage Corp., Ann Arbor, MI). The
integrated intensity of each spot was calculated as the measured
[...]. 820 spots on the gel of each sample were matched using [...]
match program with the same spots on a chosen [...] gel. In each
sample, 250 ubiquitously expressed reference spots were used to
adjust for variations [...] differed because of batch were corrected
after spot-size quantification.

Mass Spectrometry and 2D Western Blotting-Preparative 2D gels were
run using extracts from A549 lung adenocarcinoma cells [...]
experimental conditions as the analytical 2D gels, except 30% more
protein was loaded. The resolved protein gels were silver-stained
using successive incubations in 0.02% sodium thiosulfate for 2 min,
0.1% silver nitrate [...] formaldehyde plus 2% sodium carbonate [...]
digestion followed by MALDI-MS using a MALDI-TOF [...] mass
spectrometer (Perseptive Biosystems, Framingham, MA). The [...]
(University of California, San Francisco,
prospector.ucsf.edu/ucsfhtml3.2/msfit.htm) [...]. Some of the [...]
prior to this [...] the basis of sequence (14). The identified
protein spots used in this [...] 2D-PAGE Western blot [...]. GRP58
and Op18 are shown in Fig. 1, C and E; the others [...] ApoJ, 14-3-3,
Annexin I, Annexin II, PGP9.5, DJ-1, GST-pi, and PGAM, as described
elsewhere.¹

[...] value of the protein spot. The transform x → log(1 + x) was
applied [...] protein and mRNA expression levels within the same
samples [...] Spearman correlation coefficient analysis. To identify
potentially significant correlations between gene and protein
expression, we used an analytical strategy similar to SAM
(significance analysis of microarrays) (17), which uses a permutation
technique to determine the significance of changes in [...]
coefficients between gene and protein expression, genes were
exchanged first in such a way that permutated correlation
coefficients were calculated based on pseudo pairs of genes and
proteins. [...] distribution of permutated correlation coefficients
became stable [...]. The procedure was then repeated 60 times to
obtain 60 sets of permutated correlation coefficients. For each of
the 60 permutations, the correlations of genes and proteins were
ranked

¹ Chen et al., submitted for publication.
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Spot 


Unlgene 




1484 


Hs^ldlfi 


1 ADiA 


0967 


Ha 77899 


IrM 1 


0353 






0865 




vinPU 


1108 






1203 




TPlI 


0523 




KRTIS 


1492 


nS«pl910 


LAP18 






LAP18 


1101 


NOAMS' 

n3*7o22o 




udud 


- Ks^424G3 


KPfTB 


Hs29776d 


ViM 




H&297763 


VIM 


1874 


Hd.75313 


AKR1B1 




Hs»75S44 


YWMAH 


2524 
2324 


H8.7822$ 


AHXM 


H8,65114 


KRnri8 


1182 


HB41707 


H8PB3 


0360 
0892 


HsJ289101 


QRP68 


Hs.75313 


AKR1B1 


0861 


HdJ53l3 


AKR1B1 


0853 


H8.76313 


AKR1B1 


2603 


Hs.76392 


AL0H1 


0381 


HS76392 


ALDH1 


0371 


Ha.7e392 


ALOHI 


1179 


Hs.78225 ~ . 


ANXA1 


07^ 


H3.78225 


ANXA1 


0700 


H8.78225 


ANXAi 


2508 


Hs.217403 


ANXA2 


0772 


Hs.217493. 


ANXA2 


0723 


H8.217493 


ANXA2 


1239 


H8.93194 


AP0A1 


1Z37 


H3^3194 


APOA1 




Hs.93194 


AP0A1 




H8^6 


ATP5B 




Hs^S 


ATP5B 


Wis*! 




ATP5B 


fiiia.1 

vCKhS 


H8.78106 


CLU 




HS.75106 


CLU 


1S97 


119.119140 


EIF5A 




n8.i 19140 


EIFSA 


179R- 


H8.5241' 


FABP1 


1719 


n8^41 


FABP1 


0947 


H8.169476 


QAPO 


1232 


H9.76207 


GL01 


1220 


H8.75207 


QL01 


1695 


Hs.168300 


HAP1 


1810 


H8.75dgo 


HP 


1459 


H3.75990 


HP 


1458 


H8.75900 


HP 


0819 


H$.76990 


HP 


0815 


HS.759W . 


HP 


1250 


H8.41707 


HSPB3 


0549 


Hs.79037 


HSPD1 


0338 


Hs.78037 


HSPD1 


0333 


H8.79037 


HSPD1 


0331 


Hs.79037 


HSPD1 


2381 


Hs.65114 


KRT18 


0636 


H3.65114 


KRT18 



Protein name



0.4003 
0.3930 
a3802 
0.3893 



a3395 
0.3335 
0.3234 
0.0164 
0L3102 
0.3049 
0.2939 
6.2809 
0.2790 

ojms 

0.2012 
0.2801 
0.2658 
02916 
-0^460 
0i>761 
-0i>876 
-0.0565 
—0.0371 
-0.0680 
0.2062 
-0.0739 
-0.0228 
0J222Z 
' 0.2080 
0.0701 
0.1133 
-0.0373 
-0.0894 
0.0080 
0.0122 
-0.0992 
-0.0483 
-0.0443 
-0,0726 
-0.0376 
-0:1916 
-0X>473 
0.1743 
0.2249 
0.0450 
-0X>137 
-0.4672 
0.0802 
-0.0305 
0.0481 
-0.0034 
-0.1024 
0.1074 
0.2265 
0.1383 
0.1603 
0.2016 
0.1106 



OP18 (stathmin)
Tropomyosins 1-5
Protein disulfide isomerase (GRP58)
Glyceraldehyde-3-phosphate dehydrogenase
Hsp27
Triose phosphate isomerase (TPI)
Cytokeratin 18
OP18 (stathmin)
OP18 (stathmin)
Annexin variant I
Cytokeratin 8
Vimentin
Vimentin
Aldose reductase
14-3-3 η
Annexin I
Cytokeratin 18
Hsp27
Phospholipase C (GRP58)
Aldose reductase
Aldose reductase
Aldose reductase
Aldehyde dehydrogenase
Aldehyde dehydrogenase
Aldehyde dehydrogenase
Annexin variant I
Annexin I
Annexin I
Lipocortin (annexin II)
Lipocortin (annexin II)
Lipocortin
Apolipoprotein A1 (ApoA1)
Apolipoprotein A1 (ApoA1)
Apolipoprotein A1 (ApoA1)
ATP synthase β subunit precursor
ATP synthase β subunit precursor
ATP synthase β subunit precursor
Apolipoprotein J (ApoJ)
Apolipoprotein J (ApoJ)
eIF-5A
eIF-5A
L-FABP
L-FABP
Glyceraldehyde-3-phosphate dehydrogenase
Glyoxalase-I
Glyoxalase-I
Huntingtin-associated protein 1 (neuroan 1)
α-Haptoglobin
α-Haptoglobin
α-Haptoglobin
β-Haptoglobin
β-Haptoglobin
Hsp27
Hsp60
Hsp60
Hsp60
Hsp60
Cytokeratin 18
Cytokeratin 18

Table II-continued

Correlation coefficients of protein and mRNA where multiple isoforms were present on 2D gels
r*, correlation coefficient value > 0.2445; p < 0.05. Values in boldface are significant at p < 0.05.

Spot         Unigene      Gene name  r*          Protein name
[illegible]  Hs.65114     KRT18       0.1279     Cytokeratin 18
0528         Hs.65114     KRT18       0.0414     Cytokeratin 18
0527         Hs.65114     KRT18       0.0436     Cytokeratin 18
0514         Hs.65114     KRT18       0.0733     Cytokeratin 18
[illegible]  Hs.242463    KRT8       -0.0111     Cytokeratin 8
[illegible]  Hs.242463    KRT8        0.0347     Cytokeratin 8
[illegible]  Hs.242463    KRT8       -0.1311     Cytokeratin 8
0443         Hs.242463    KRT8        0.0942     Cytokeratin 8
1468         Hs.61915     LAP18       0.0495     OP18 (stathmin)
0321         Hs.75655     P4HB       -0.0546     PDI (prolyl-4-OH-B)
0320         Hs.75655     P4HB       -0.0041     PDI (prolyl-4-OH-B)
1083         Hs.76323     PHB         0.0441     Prohibitin
0837         Hs.76323     PHB         0.1402     Prohibitin
0326         Hs.297681    SERPINA1   -0.0227     α-1-Antitrypsin
0322         Hs.297681    SERPINA1   -0.0277     α-1-Antitrypsin
0241         Hs.297681    SERPINA1   -0.0148     α-1-Antitrypsin
1280         Hs.[?]1254   SFTPA1     -0.1488     Pulmonary surfactant-associated protein
1278         Hs.[?]1254   SFTPA1     -0.2040     Pulmonary surfactant-associated protein
0806         Hs.73980     TNNT1       0.1162     Troponin T
0778         Hs.73980     TNNT1       0.0740     Troponin T
1213         Hs.83848     TPI1        0.0024     Triose phosphate isomerase (TPI)
1210         Hs.83848     TPI1        0.0490     Triose phosphate isomerase (TPI)
1207         Hs.83848     TPI1       -0.1616     Triose phosphate isomerase (TPI)
1204         Hs.83848     TPI1        0.0209     Triose phosphate isomerase (TPI)
[illegible]  Hs.83848     TPI1        0.0721     Triose phosphate isomerase (TPI)
1161         Hs.83848     TPI1        0.2266     Triose phosphate isomerase (TPI)
1062         Hs.77899     TPM1       -0.1040     Tropomyosin [illegible]-product
1039         Hs.77899     TPM1       -0.[?]899   Cytoskeletal tropomyosin
1035         Hs.77899     TPM1       -0.3821     Tropomyosin
0783         Hs.77899     TPM1        0.0757     Tropomyosins 1-5
1574         Hs.194366    TTR        -0.0055     Transthyretin
0809         Hs.194366    TTR         0.0399     Transthyretin [illegible]
2202         Hs.76118     UCHL1      -0.0220     Ubiquitin carboxyl-terminal hydrolase isozyme L1
1246         Hs.76118     UCHL1      -0.1261     Ubiquitin carboxyl-terminal hydrolase isozyme L1
1242         Hs.76118     UCHL1       0.1473     Ubiquitin carboxyl-terminal hydrolase isozyme L1
0606         Hs.297753    VIM         0.0951     Vimentin
0594         Hs.297753    VIM        -0.2684     Vimentin-derived protein (vid4)
0508         Hs.297753    VIM         0.1008     Vimentin-derived protein (vid2)
0419         Hs.297753    VIM         0.0032     Vimentin-derived protein (vid1)
1270         Hs.75544     YWHAH       0.0059     14-3-3 η



such that $\rho_p(i)$ denotes the $i$th largest correlation
coefficient for the $p$th permutation. Hence, the expected
correlation coefficient, $\rho_E(i)$, was the average over the 60
permutations, $\rho_E(i) = \sum_{p=1}^{60} \rho_p(i)/60$. A scatter
plot of the observed correlations ($\rho(i)$) versus the expected
correlations is shown in Fig. 2D. For this study, we chose threshold
$\Delta = 0.115$ so that a correlation would be considered
significant if the absolute value of the difference between
$\rho(i)$ and $\rho_E(i)$ was greater than the threshold. Twenty-nine
(including one with observed correlation coefficient -0.4672) of 165
pairs of gene and protein expression were called significant by this
criterion, and the permuted data generated an average of 5.1 falsely
significant pairs of gene and protein expression. This provided an
estimated false discovery rate (the percentage of pairs of gene and
protein expression identified by chance) for our data set.
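
The permutation procedure described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' code. The function names (`ranks`, `spearman`, `permutation_fdr`) are hypothetical, pseudo pairs are formed here simply by shuffling which mRNA vector is paired with which protein spot, and the false discovery rate is taken as the mean number of falsely called pairs divided by the number of called pairs. Only the 60 permutations and the threshold of 0.115 come from the text.

```python
import random


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def permutation_fdr(protein, mrna, delta=0.115, n_perm=60, seed=0):
    """protein[i] and mrna[i] are per-sample vectors for spot i (assumed layout)."""
    rng = random.Random(seed)
    observed = sorted((spearman(p, m) for p, m in zip(protein, mrna)), reverse=True)
    expected = [0.0] * len(observed)
    permuted_sets = []
    for _ in range(n_perm):
        shuffled = mrna[:]
        rng.shuffle(shuffled)  # pseudo pairs of genes and proteins
        rho = sorted((spearman(p, m) for p, m in zip(protein, shuffled)), reverse=True)
        permuted_sets.append(rho)
        for i, value in enumerate(rho):
            expected[i] += value / n_perm  # average of the i-th largest value
    called = sum(abs(o - e) > delta for o, e in zip(observed, expected))
    false_calls = sum(
        sum(abs(v - e) > delta for v, e in zip(rho, expected)) for rho in permuted_sets
    ) / n_perm
    fdr = false_calls / called if called else 0.0
    return called, false_calls, fdr
```

Sorting both the observed and the permuted coefficients before comparing them is what makes the comparison rank-wise: the i-th largest observed value is judged against the i-th largest value expected by chance.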

RESULTS

Correlation of Individual Proteins and mRNA Expression within Each
Tumor-We have examined quantitatively 165 protein spots on 2D gels
representing 98 genes and compared protein levels with mRNA levels
for a cohort of 85 lung adenocarcinomas and normal lung samples. Of
the 165 protein spots, 69 proteins were represented by only one known
spot on 2D gels for an individual gene, whereas 96 protein spots
showed multiple protein products from 29 different genes. 2D Western
blotting verified the proteins identified by mass spectrometry when
specific antibodies were available. Spearman correlation coefficients
of the proteins and their associated mRNA for each protein spot were
generated using all 76 lung adenocarcinomas and nine non-neoplastic
lung tissues (see Tables I and II, and see Figs. 1 and 2). The
correlation coefficients (r) ranged from -0.467 to 0.442 (Fig. 2D). A
total of 28 protein spots (21 genes) were found to have a
statistically significant correlation between expression of their
protein and mRNA (r > 0.2445; p < 0.05). This accounts for 17%
(28/165) of the 165 protein spots. Among the 69 genes for which only
a single protein spot was known (Table I), nine genes (9/69, 13%)
were observed to show a statistically significant relationship
between protein and mRNA abundance (r > 0.2445; p < 0.05). The
proteins whose expression levels were correlated with their mRNA
abundance included those involved in signal transduction,
carbohydrate metabolism, apoptosis, protein post-translational
modification, structural proteins, and heat shock proteins (Table
III).

Individual Isoforms of the Same Protein Have Different Protein/mRNA
Correlation Coefficients-Of the 165 protein spots, 96 represent
protein products of 29 genes with at least two isoforms. Among these
96 protein spots, 19 (19/96 protein spots, 20%) showed a
statistically significant correlation between their protein and mRNA
expression (r > 0.2445; p < 0.05) (Table II) and represented 12 genes
(12/29, 41%). Individual isoforms of the same protein demonstrated
different protein/mRNA correlation coefficients. For example,
2D-PAGE/Western analysis revealed four isoforms of OP18 differing in
isoelectric point but similar in molecular weight. Three of the four
isoforms (spots 1492, 1493, and 1494) showed a statistically
significant correlation between their protein and mRNA abundance
(r = 0.3234, 0.3154, and 0.4003, respectively). The fourth isoform
(spot 1468) showed no correlation between protein and mRNA expression
(r = 0.0495). Similarly, just one of five quantified isoforms of
cytokeratin 8 (spot 439) demonstrated a statistically significant
correlation between protein and mRNA abundance (r = 0.3049; p < 0.05)
(Table II).

In addition to differences in the relationship between mRNA levels
and protein expression among separate isoforms, some genes with very
comparable mRNA levels showed a 24-fold difference in their protein
expression. Genes with comparable protein expression levels also
showed up to a 28-fold variance in their mRNA levels.
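
The fractions quoted in this part of the Results can be checked with a few lines of Python. This is only an arithmetic sketch; the helper name `share` and the dictionary labels are made up for illustration, and the counts are the ones stated in the text (165 spots, 98 genes, 69 single-spot genes, 96 spots from 29 multi-isoform genes).

```python
def share(part, whole):
    """part/whole as a percentage, rounded to the nearest integer."""
    return round(100 * part / whole)


counts = {
    "significant spots among all spots": (28, 165),
    "significant genes among single-spot genes": (9, 69),
    "significant spots among multi-isoform spots": (19, 96),
    "multi-isoform genes with a significant spot": (12, 29),
}

for label, (part, whole) in counts.items():
    print(f"{label}: {part}/{whole} = {share(part, whole)}%")

# Consistency of the spot and gene totals given in the text.
assert 69 + 96 == 165  # single-spot genes + multi-isoform spots = all spots
assert 69 + 29 == 98   # single-spot genes + multi-isoform genes = all genes
```

The four ratios come out as 17%, 13%, 20%, and 41%, matching the percentages reported in the text.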

Lack of Correlation for mRNA and Protein Expression when Using
Average Tumor Values across All 165 Protein Spots (98 Genes)-The
relationship between mRNA and protein expression was also examined by
using the average expression values for all samples. To analyze this
relationship using this approach, the average value for each protein
or mRNA was generated using all 85 lung tissue samples. The
normalized average protein values ranged from -0.0646 to 0.0979 (raw
value 0.0036 to 4.1947), and the range for mRNA was from 0 to 15260.5
for all 165 individual protein spots. The Spearman correlation
coefficient for the whole data set (165 protein spots/98 genes) was
-0.025 (Fig. 3A). Even for the 28 protein spots (Fig. 2D) that were
found to have a statistically significant correlation between their
mRNA and protein, use of the average value resulted in a correlation
coefficient value of -0.035, which was not significant (Fig. 3B).
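
The contrast between within-gene correlation across samples and the correlation of average values across genes can be illustrated with a toy example. The numbers below are invented purely for illustration and are not data from the study; `spearman_no_ties` is a hypothetical helper that uses the textbook rank-difference formula, which is valid only when there are no tied values.

```python
def spearman_no_ties(x, y):
    """Spearman correlation via 1 - 6*sum(d^2) / (n*(n^2 - 1)); assumes no ties."""
    def rank(values):
        order = sorted(range(len(values)), key=values.__getitem__)
        out = [0] * len(values)
        for position, index in enumerate(order, start=1):
            out[index] = position
        return out

    rx, ry = rank(x), rank(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Invented toy data: per-gene (mRNA, protein) values over five samples.
genes = {
    "A": ([10, 11, 12, 13, 14], [1.0, 1.1, 1.2, 1.3, 1.4]),
    "B": ([100, 101, 102, 103, 104], [0.1, 0.2, 0.3, 0.4, 0.5]),
    "C": ([1000, 1001, 1002, 1003, 1004], [5.0, 5.1, 5.2, 5.3, 5.4]),
    "D": ([50, 51, 52, 53, 54], [9.0, 9.1, 9.2, 9.3, 9.4]),
}

# Within each gene, protein tracks mRNA perfectly across samples.
within = {g: spearman_no_ties(m, p) for g, (m, p) in genes.items()}

# Across genes, the per-gene averages are uncorrelated.
mean_mrna = [sum(m) / len(m) for m, _ in genes.values()]
mean_protein = [sum(p) / len(p) for _, p in genes.values()]
across = spearman_no_ties(mean_mrna, mean_protein)

print(within)  # every gene: 1.0
print(across)  # 0.0
```

In this toy set each gene shows a perfect within-gene correlation, yet the correlation of the averages across genes is zero, which is the kind of pattern that makes average mRNA values a poor predictor of average protein levels.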

Lack of a Relationship between Protein/mRNA Correlation Coefficients
and Average Protein Abundance-To determine whether an absolute
protein level might influence the correlation with mRNA, the mean
value of each protein (relative abundance) and the Spearman
protein/mRNA correlation coefficients among all 85 samples were
examined. No relationship between the protein abundance and the
correlation coefficients was observed (r = 0.039; p > 0.05). A
detailed analysis of separate subsets of proteins with differing
levels of abundance (less than -0.0014, larger than -0.0014, or
larger than 0.0077) also showed a lack of correlation between mRNA
and protein expression among the 83 (50%), 82 (50%), and 41 (25%) of
165 total protein spots, respectively (r = 0.016, 0.08, and 0.172,
respectively).

Stage-related Changes in the Protein/mRNA Correlation Coefficients-To
determine whether the 21 genes (28 protein spots) showing a
significant correlation between the protein and mRNA expression among
all samples demonstrate changes in this relationship during tumor
progression, the correlations were examined separately for stage I
(n = 57) and stage III (n = 19) lung adenocarcinomas (Table III). The
number of non-neoplastic lung samples (n = 9) was insufficient for a
separate correlation analysis of this group. Many of the protein
spots represent one of several known protein isoforms for a given
gene. The majority of genes (16/21) did not differ in the
protein/mRNA correlation between stage I and stage III tumors,
indicating a similar regulatory relationship between the mRNA and
protein spot. GRP-58, PSMC, SOD1, TPI1, and VIM, however, were found
to demonstrate significant differences in the correlation
coefficients between stage I and stage III lung adenocarcinomas. For
GRP-58, PSMC, and VIM the change in the correlation coefficient was
because of a relative increase in protein expression in stage III
tumors. For SOD and TPI the change resulted from a relative decrease
in expression of this specific protein in stage III tumors.

DISCUSSION

Relatively little is known about the regulatory mechanisms
controlling the complex patterns of protein abundance and
post-translational modification in tumors. Most reports concerning
the regulation of protein translation have focused on

Stage-dependent analysis of protein/mRNA correlation coefficients
Values in boldface indicate a significant difference between stage I and stage III.



Spot 



Gene name 



no 



1874 

m4 



0963 
. 1314 
1405 
0655 
0350. 
0264 
1192 
0523 
0439 
1492 
1638 
1252 
11D4 
1454 
1203 
0957 
0593 



AKR161 

MXM 

AMXM 

ANXA6 

DJ-1 

RL 

GAPD 

QRP$8 

HNRPK 

HSPB3 

KRT18 

KRia 

IAM8 

LQAtSI 

PSMC 

SFN 

S0D1 

TPI1 

TPM1 

VIM 

YWHAH 



Function 



0.289 
0*184 
0.600 

. 0.241 
0.383 
0.126 
0.243 
0.627 
0.300 
0»457 
0.116 
0.326 
0.483 
O.20O 
0.283 
0.466 
0.352 
0.376 
0.475 

-0/W54 
0.283 



0.106 
0^2 
0.362 
0.380 
0.354 
0.358 
0^1 
-a0B7 
0.243 



0^71 
0>438 
0.663 
0.628 
a080 
0,475 
0.078 
0.009 
0.225 
0^ 
0^0 



Carbohydrate metabolism; electron transporter
Phospholipase inhibitor; signal transduction
Phospholipase inhibitor
Phospholipase inhibitor; calcium binding; phospholipid binding
Signal transduction
Iron storage protein
Carbohydrate metabolism (glycolysis regulation)
Signal transduction; protein disulfide isomerase
RNA-binding protein (RNA processing/modification)
Heat shock protein
Structural protein
Structural protein
Signal transduction; cell growth and maintenance
Apoptosis; cell adhesion; cell size control
Protein degradation
Signal transduction (protein kinase C inhibitor)
Oxidoreductase
Carbohydrate metabolism
Structural protein (muscle); control of heart
Structural protein
Signal transduction



one or several protein products (18). Celis et al. (19) found a good
correlation between transcript and protein levels among 40 well
resolved, abundant proteins using a proteomic and microarray study of
bladder cancer. By comparing the mRNA and protein expression levels
within the same tumor samples, we found that 17% (28/165) of the
protein spots (21/98 genes) show a statistically significant
correlation between mRNA and protein. These proteins appear to
represent a diverse group of gene products and include those involved
in signal transduction, carbohydrate metabolism, protein
modification, cell structure, heat shock, and apoptosis. These
results suggest that expression of this subset of 165 proteins is
likely to be regulated at the transcriptional level in these tissues.
The majority of the protein isoforms, however, did not correlate with
mRNA levels, and thus their expression is regulated by other
mechanisms. We also observed a subset of proteins that demonstrated a
negative correlation with the mRNA expression values; for example,
α-haptoglobin demonstrated a strong negative correlation with its
mRNA expression values. This may reflect negative feedback on the
mRNA or the protein or the presence of other regulatory influences
that are not understood currently.

Post-translational modification or processing will result in
individual protein products of the same gene migrating to different
locations on 2D-PAGE gels (20). Because the identity of all possible
isoforms for each protein examined has not been characterized
completely, this may influence the correlation analyses performed in
this study. This is partly because of limitations of the 2D-PAGE and
mass spectrometry technologies (21, 22). Potential inconsistencies
between mRNA and protein correlations that have been reported may
also be because of differences, even in the same gene, in the
mechanisms of protein translation among different cells or as
measured in different laboratories (23).

In this study, we examined 165 protein spots identified in lung
adenocarcinomas. Ninety-six protein spots, representing the products
of 29 genes, contained at least two protein isoforms. Nineteen of 96
protein spots, representing 12 genes, were shown to have a
statistically significant correlation between their protein and mRNA
expression, suggesting that the levels of these proteins reflect the
transcription of the corresponding genes. Differences in protein/mRNA
correlations were found among the individual isoforms of a given
protein. For example, of the four OP18 isoforms, three showed a
statistically significant correlation between the protein and mRNA
expression levels. The lack of relationship for the one isoform,
however, indicates that individual protein isoforms of the same gene
product can be regulated differentially. This is not unexpected and
likely reflects other post-translational mechanisms that can
influence isoform abundance in tissues and cancer.

In addition to the analyses of the correlation of mRNA/protein within
the same tumor samples, we also tested the global relationship
between mRNA and the corresponding protein abundance across all 165
protein spots in the lung samples. A protein and mRNA average value
for each gene was generated using all 85 lung tissue samples. We
observed a very wide range of normalized average protein and mRNA
values. The correlation coefficient generated using this average
value data set was -0.025, and even for the 28 protein spots that
showed a statistically significant correlation between individual
mRNA and proteins, the correlation value was only -0.035. This
suggests that it is not possible to predict overall protein
expression levels based on average mRNA abundance in lung cancer
samples. This conclusion is also supported by previous results from
Anderson and Seilhamer (24), who examined 19 genes in human liver
cells, and by Gygi et al. (25), who examined 106 genes in yeast. Both
studies found a lack of correlation between mRNA and protein
expression when average or overall levels were used.

The overall correlation of mRNA and protein levels across all 165
protein spots (A) and across 28 protein spots that contained
individual r values larger than 0.244 (B) are shown. Each protein or
mRNA mean value was calculated based on all 76 lung adenocarcinomas
and nine non-neoplastic lung samples using quantitative 2D-PAGE and
Affymetrix oligonucleotide microarrays. The Spearman correlation
coefficients for the two data sets (A and B) were -0.025 and -0.035,
respectively, indicating a lack of correlation if mean values for
mRNA and protein for all samples are used.

A good correlation was reported when the 11 most abundant proteins
were examined in yeast (25), suggesting that the level of protein
abundance may be a factor that influences the correlation between
mRNA and protein. In the present study, a fairly wide range of mean
protein values among 165 protein spots in lung adenocarcinomas was
observed, and the correlation coefficients also varied from -0.467 to
0.442.



A comparison between the mean value of each protein and the
correlation coefficient generated using all 85 tissue samples did not
reveal a strong relationship between the overall protein abundance
and the correlation coefficients (r = 0.039; p > 0.05). Detailed
analysis of different subsets of protein abundance also failed to
show a correlation between mRNA and protein expression. Thus, in
contrast to yeast, a relationship between mRNA/protein correlation
coefficient and protein abundance in human lung adenocarcinomas was
not observed.

The results of this study indicate that the level of protein
abundance in lung adenocarcinomas is associated with the
corresponding levels of mRNA in 17% (28 proteins) of the total 165
protein spots examined. This was substantially higher than the amount
predicted to result by chance alone (which was 5.1) and suggests that
a transcriptional mechanism likely underlies the abundance of these
proteins in lung adenocarcinomas. We also demonstrate that the
expression of individual isoforms of the same protein may or may not
correlate with the mRNA, indicating that separate and likely
post-translational mechanisms account for the regulation of isoform
abundance. These mechanisms may also account for the differences in
the correlation coefficients observed between stage I and stage III
tumors, indicating that specific protein isoforms show regulatory
changes during tumor progression. Further studies in lung
adenocarcinomas will examine the relationship between the expression
of individual protein isoforms and specific clinical-pathological
features of these tumors, such as the presence of angiolymphatic
invasion, and nodal or pleural surface involvement. The potential to
identify specific protein isoforms associated with biological
behavior in lung adenocarcinomas would be of considerable interest
and will add to our understanding of the regulation of gene products
by transcriptional, translational, and post-translational mechanisms.
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Correlation between Protein and mRNA Abundance in Yeast 
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We have determined the relationship between mRNA and protein expression levels for selected genes 
expressed in the yeast Saccharomyces cerevisiae growing at mid-log phase. The proteins contained in total yeast 
cell lysate were separated by high-resolution two-dimensional (2D) gel electrophoresis. Over 150 protein spots 
were excised and identified by capillary liquid chromatography-tandem mass spectrometry (LC-MS/MS). 
Protein spots were quantified by metabolic labeling and scintillation counting. Corresponding mRNA levels 
were calculated from serial analysis of gene expression (SAGE) frequency tables (V. E. Velculescu, L. Zhang, 
W. Zhou, J. Vogelstein, M. A. Basrai, D. E. Bassett, Jr., P. Hieter, B. Vogelstein, and K. W. Kinzler, Cell 
88:243-251, 1997). We found that the correlation between mRNA and protein levels was insufficient to predict 
protein expression levels from quantitative mRNA data. Indeed, for some genes, while the mRNA levels were 
of the same value the protein levels varied more than 20-fold. Conversely, invariant steady-state levels of 
certain proteins were observed with respective mRNA transcript levels that varied by as much as 30-fold. 
Another interesting observation is that codon bias is not a predictor of either protein or mRNA levels. Our 
results clearly delineate the technical boundaries of current approaches for quantitative analysis of protein 
expression and reveal that simple deduction from mRNA transcript analysis is insufficient. 



The description of the state of a biological system by the 
quantitative measurement of the system constituents is an es- 
sential but largely unexplored area of biology. With recent 
technical advances including the development of differential 
display-PCR (21), of cDNA microarray and DNA chip tech- 
nology (20, 21), and of serial analysis of gene expression 
(SAGE) (34, 35), it is now feasible to establish global and 
quantitative mRNA expression profiles of cells and tissues in 
species for which the sequence of all the genes is known. 
However, there is emerging evidence which suggests that 
mRNA expression patterns are necessary but are by them- 
selves insufficient for the quantitative description of biological 
systems. This evidence includes discoveries of posttranscriptional 
mechanisms controlling the protein translation rate (15), 
the half-lives of specific proteins or mRNAs (33), and the 
intracellular location and molecular association of the protein 
products of expressed genes (32). 

Proteome analysis, defined as the analysis of the protein 
complement expressed by a genome (26), has been suggested 
as an approach to the quantitative description of the state of a 
biological system by the quantitative analysis of protein expres- 
sion profiles (36). Proteome analysis is conceptually attractive 
because of its potential to determine properties of biological 
systems that are not apparent by DNA or mRNA sequence 
analysis alone. Such properties include the quantity of protein 
expression, the subcellular location, the state of modification, 
and the association with ligands, as well as the rate of change 
with time of such properties. In contrast to the genomes of a 
number of microorganisms (for a review, see reference 11) and 
the transcriptome of Saccharomyces cerevisiae (35), which have 
been entirely determined, no proteome map has been com- 
pleted to date. 

The most common implementation of proteome analysis is 
the combination of two-dimensional gel electrophoresis (2DE) 
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(isoelectric focusing-sodium dodecyl sulfate [SDS]-polyacryl- 
amide gel electrophoresis) for the separation and quantitation 
of proteins with analytical methods for their identification. 
2DE permits the separation, visualization, and quantitation of 
thousands of proteins reproducibly on a single gel (18, 24). By 
itself, 2DE is strictly a descriptive technique. The combination 
of 2DE with protein analytical techniques has added the pos- 
sibility of establishing the identities of separated proteins (1, 2) 
and thus, in combination with quantitative mRNA analysis, of 
correlating quantitative protein and mRNA expression mea- 
surements of selected genes. 

The recent introduction of mass spectrometric protein analysis 
techniques has dramatically enhanced the throughput and 
sensitivity of protein identification to a level which now permits 
the large-scale analysis of proteins separated by 2DE. The 
techniques have reached a level of sensitivity that permits the 
identification of essentially any protein that is detectable in the 
gels by conventional protein staining (9, 29). Current protein 
analytical technology is based on the mass spectrometric generation 
of peptide fragment patterns that are idiotypic for the 
sequence of a protein. Protein identity is established by corre- 
lating such fragment patterns with sequence databases (10, 22, 
37). Sophisticated computer software (8) has automated the 
entire process such that proteins are routinely identified with 
no human interpretation of peptide fragment patterns. 

In this study, we have analyzed the mRNA and protein levels 
of a group of genes expressed in exponentially growing cells of 
the yeast S. cerevisiae. Protein expression levels were quantified
by metabolic labeling of the yeast proteins to a steady state, 
followed by 2DE and liquid scintillation counting of the se- 
lected, separated protein species. Separated proteins were 
identified by in-gel tryptic digestion of spots with subsequent 
analysis by microspray liquid chromatography-tandem mass 
spectrometry (LC-MS/MS) and sequence database searching. 
The corresponding mRNA transcript levels were calculated 
from SAGE frequency tables (35).

This study, for the first time, explores a quantitative comparison
of mRNA transcript and protein expression levels for
a relatively large number of genes expressed in the same metabolic
state. The resultant correlation is insufficient for prediction
of protein levels from mRNA transcript levels. We have
also compared the relative amounts of protein and mRNA
with the respective codon bias values for the corresponding
genes. This comparison indicates that codon bias by itself is
insufficient to accurately predict either the mRNA or the protein
expression levels of a gene. In addition, the results demonstrate
that only highly expressed proteins are detectable by
2DE separation of total cell lysates and that therefore the
construction of complete proteome maps with current technology
will be very challenging, irrespective of the type of organism.

FIG. 1. Schematic illustration of proteome analysis by 2DE and mass spectrometry. In part I, proteins are separated by 2DE, stained spots are excised and subjected
to in-gel digestion with trypsin, and the resulting peptides are separated by on-line capillary high-performance liquid chromatography. In part II, a peptide is shown
eluting from the column in part I. The peptide is ionized by electrospray ionization and enters the mass spectrometer. The mass of the ionized peptide is detected, and
the first quadrupole mass filter allows only the specific mass-to-charge ratio of the selected peptide ion to pass into the collision cell. In the collision cell, the energized,
ionized peptides collide with neutral argon gas molecules. Fragmentation of the peptide is essentially random but occurs mainly at the peptide bonds, resulting in smaller
peptides of differing lengths (masses). These peptide fragments are detected as a tandem mass (MS/MS) spectrum in the third quadrupole mass filter where two ion
series are recorded simultaneously, one each from sequencing inward from the N and C termini of the peptide, respectively. In part III, the MS/MS spectrum from the
selected, ionized peptide is compared to predicted tandem mass spectra computer generated from a sequence database. Provided that the peptide sequence exists in
the database, the peptide and, by association, the protein from which the peptide was derived can be identified. Unambiguous protein identification is attained in a single
analysis because multiple peptides are identified as being derived from the same protein.

MATERIALS AND METHODS

Yeast strain and growth conditions. The source of protein and message transcripts
for all experiments was YPH499 (MATa ura3-52 lys2-801 ade2-101
leu2-Δ1 his3-Δ200 trp1-Δ63) (30). Logarithmically growing cells were obtained by
growing yeast cells to early log phase (3 × 10^6 cells/ml) in YPD rich medium
(YPD supplemented with 6 mM uracil, 4.8 mM adenine, and 24 mM tryptophan)
at 30°C (35). Metabolic labeling of protein was accomplished in YPD medium
exactly as described elsewhere (4) with the exception that 1 ml of cells was
labeled with 3 mCi to offset methionine present in YPD medium. Protein was
harvested as described by Garrels and coworkers (12). Harvested protein was
lyophilized, resuspended in isoelectric focusing gel rehydration solution, and
stored at -80°C.

FIG. 2. 2D silver-stained gel of the proteins in yeast total cell lysate. Proteins were separated in the first dimension (horizontal) by isoelectric focusing and then in
the second dimension (vertical) by molecular weight sieving. Protein spots (156) were chosen to include the entire range of molecular weights, isoelectric focusing points,
and staining intensities. Spots were excised, and the corresponding protein was identified by mass spectrometry and database searching. The spots are labeled on the
gel and correspond to the data presented in Table 1. Molecular weights are given in thousands.

2DE. Soluble proteins were run in the first dimension by using a commercial
flatbed electrophoresis system (Multiphor II; Pharmacia Biotech). Immobilized
polyacrylamide gel (IPG) dry strips with nonlinear pH 3.0 to 10.0 gradients
(Amersham-Pharmacia Biotech) were used for the first-dimension separation.
Forty micrograms of protein from whole-cell lysates was mixed with IPG strip
rehydration buffer (8 M urea, 2% Nonidet P-40, 10 mM dithiothreitol), and 250
to 380 µl of solution was added to individual lanes of an IPG strip rehydration
tray (Amersham-Pharmacia Biotech). The strips were allowed to rehydrate at
room temperature for 1 h. The samples were run at 300 V-10 mA-5 W for 2 h,
then ramped to 3,500 V-10 mA-5 W over a period of 3 h, and then kept at 3,500
V-10 mA-5 W for 15 to 19 h. At the end of the first-dimension run (60 to 70 kV·h),
the IPG strips were reequilibrated for 8 min in 2% (wt/vol) dithiothreitol in
2% (wt/vol) SDS-6 M urea-30% (wt/vol) glycerol-0.05 M Tris HCl (pH 6.8) and
for 4 min in 2.5% iodoacetamide in 2% (wt/vol) SDS-6 M urea-30% (wt/vol)
glycerol-0.05 M Tris HCl (pH 6.8). Following reequilibration, the strips were
transferred and apposed to 10% polyacrylamide second-dimension gels. Polyacrylamide
gels were poured in a casting stand with 10% acrylamide-2.67%
piperazine diacrylamide-0.375 M Tris base-HCl (pH 8.8)-0.1% (wt/vol) SDS-0.05%
(wt/vol) ammonium persulfate-0.05% TEMED (N,N,N',N'-tetramethylethylenediamine)
in Milli-Q water. The apparatus used to run second-dimension gels
was a noncommercial apparatus from Oxford Glycosciences, Inc. Once the IPG
strips were apposed to the second-dimension gels, they were immediately run at
50 mA (constant)-500 V-85 W for 20 min, followed by 200 mA (constant)-500
V-85 W until the buffer front line was 10 to 15 mm from the bottom of the gel.
Gels were removed and silver stained according to the procedure of Shevchenko
et al. (29).
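
As a rough consistency check on the first-dimension focusing program, the volt-hours can be added up from the three stages given in the text (300 V for 2 h, a 3-h ramp to 3,500 V, then 15 to 19 h at 3,500 V). The sketch below assumes the ramp is linear in voltage, which the text does not state; the function name is illustrative only.

```python
def first_dimension_kvh(hold_hours):
    """Approximate integrated kV*h for the focusing program described in the text.

    Assumption (not stated in the source): the 3-h ramp is linear in voltage.
    """
    start_v, final_v = 300.0, 3500.0
    step1 = start_v * 2.0                     # 300 V held for 2 h
    ramp = (start_v + final_v) / 2.0 * 3.0    # mean voltage over the 3-h ramp
    hold = final_v * hold_hours               # 3,500 V held for 15 to 19 h
    return (step1 + ramp + hold) / 1000.0     # V*h -> kV*h


for hours in (15, 19):
    print(f"{hours} h hold: {first_dimension_kvh(hours):.1f} kV*h")
```

Under that assumption the program integrates to roughly 59 to 73 kV·h, which brackets the 60 to 70 kV·h quoted for the run.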

Protein identification. Gels were exposed to X-ray film overnight, and then the
silver staining and film were used to excise 156 spots of varying intensities,
molecular weights, and isoelectric focusing points. In order to increase the
detection limit by mass spectrometry, spots were cut out and pooled from up to
four identical cold, silver-stained gels. In-gel tryptic digests of pooled spots were
performed as described previously (29). Tryptic peptides were analyzed by microcapillary
LC-MS with automated switching to MS/MS mode for peptide
fragmentation. Spectra were searched against the composite OWL protein sequence
database (version 30.2; 250,514 protein sequences) (24a) by using the
computer program Sequest (8), which matches theoretical and acquired tandem
mass spectra. A protein match was determined by comparing the number of
peptides identified and their respective cross-correlation scores. All protein
identifications were verified by comparison with theoretical molecular weights
and isoelectric points.



mRNA quantitation. Velculescu and coworkers have previously generated
frequency tables for yeast mRNA transcripts from the same strain grown under
the same stated conditions as described herein (35). The SAGE technology is
based on two main principles. First, a short sequence tag (15 bp) that contains
sufficient information uniquely to identify a transcript is generated. A single tag
is usually generated from each mRNA transcript in the cell which corresponds to
15 bp at the 3'-most cutting site for NlaIII. Second, many transcript tags can be
concatenated into a single molecule and then sequenced, revealing the identity of
multiple tags simultaneously. Over 20,000 transcripts were sequenced from yeast
strain YPH499 growing at mid-log phase on glucose. Assuming the previously
derived estimate of 15,000 mRNA molecules per cell (16), this would represent
a 1.3-fold coverage even for mRNA molecules present at a single copy per cell
and would provide a 72% probability of detecting such transcripts. Computer
software which took for input the gene detected, examined the nucleotide sequence,
and performed the calculation as described by Velculescu and coworkers
(35) was written. In practice, we found that for 21 of 128 (16%) genes examined
viable mRNA levels from SAGE data could not be calculated. This was because
(i) no CATG site was found in the open reading frame (ORF), (ii) a CATG site
was found but the corresponding 10-bp putative SAGE tag was not found in the
frequency tables, or (iii) identical putative SAGE tags were present for multiple
genes (e.g., TDH2_YEAST and TDH3_YEAST).
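
The lookup and its three failure cases can be sketched as follows. This is a minimal illustration, not the authors' software: the function names, the toy ORF sequences, and the frequency-table layout are all hypothetical, and the tag is taken only from within the ORF (a tag that would run past the end of the ORF is treated as unavailable). The last function uses a Poisson approximation, which is our assumption; the text does not say how the 72% figure was derived.

```python
import math

ANCHOR = "CATG"   # NlaIII recognition site
TAG_LENGTH = 10   # bases taken immediately 3' of the anchoring site


def putative_sage_tag(orf_sequence):
    """Return the 10-bp tag following the 3'-most CATG, or None if unavailable."""
    seq = orf_sequence.upper()
    site = seq.rfind(ANCHOR)
    if site == -1:
        return None                                   # case (i): no CATG site
    start = site + len(ANCHOR)
    tag = seq[start:start + TAG_LENGTH]
    return tag if len(tag) == TAG_LENGTH else None    # truncated tag: unusable


def mrna_level(gene, orfs, tag_counts_per_cell):
    """Return (copies/cell or None, reason) for one gene."""
    tag = putative_sage_tag(orfs[gene])
    if tag is None:
        return None, "no usable CATG site in ORF"
    sharing = [g for g, s in orfs.items()
               if g != gene and putative_sage_tag(s) == tag]
    if sharing:                                       # case (iii): ambiguous tag
        return None, "tag shared with " + ", ".join(sorted(sharing))
    if tag not in tag_counts_per_cell:                # case (ii): tag not in tables
        return None, "tag absent from frequency table"
    return tag_counts_per_cell[tag], "ok"


def detection_probability(tags_sequenced, mrna_per_cell, copies=1):
    """Poisson estimate of sampling a transcript at least once."""
    expected = tags_sequenced * copies / mrna_per_cell
    return 1.0 - math.exp(-expected)


# Made-up sequences for illustration only; they are not yeast genes.
toy_orfs = {
    "geneA": "ATGAAACATGGCTGCTGCTAAGTAA",
    "geneB": "ATGTTTCATGGCTGCTGCTAAGTAA",   # same tag as geneA
    "geneC": "ATGAAAGGGCCCTTTTAA",          # no CATG
    "geneD": "ATGCCCCATGTTTGGGAAACCCTAA",
}
toy_table = {"TTTGGGAAAC": 3.7}
for name in toy_orfs:
    print(name, mrna_level(name, toy_orfs, toy_table))
print(round(detection_probability(19500, 15000), 3))   # 1.3-fold coverage
```

With 1.3-fold coverage the Poisson estimate is about 0.73, close to the 72% quoted above.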
TABLE 1. Expressed genes identified from 2D gel in Fig. 2

Mol wt     pI       Spot   YPD gene   Protein      mRNA         Codon
                    no.    name^a     abundance    abundance    bias
                                      (10^3        (copies/
                                      copies/cell) cell)

17,259     6.75     133    CPR1       15.2         61.7         0.769
18,702     4.80     83     EGD2       20.1         5.2          0.724
18,726     4.44     147    YKL056C    61.2         88.4         0.831
18,978     5.95     135    YER067W    3.7          6.7          0.118
19,108     5.04     130    YLR109W    94.4         9.7          0.680
19,681     9.08     136    ATP7       11.0         NA^c         0.246
20,505     6.07     111    GUK1       16.5         3.7          0.422
21,444     5.25     148    SAR1       5.4          10.4         0.455
21,583     4.98     95     TSA1       110.6        40.1         0.845
22,602     4.30     80     EFB1       66.1         23.8         0.875
23,079     6.29     112    SOD2       12.6         2.2          0.351
23,743     5.44     137    HSP26      NA^d         0.7          0.434
24,033     5.97     96     ADK1       17.4         16.4         0.656
24,058     4.43     143    YKL117W    29.2         10.4         0.339
24,353     6.30     140    TFS1       8.1          0.7          0.146
24,662     5.85     99     URA5       25.4         6.0          0.359
24,808     6.33     97     GSP1       26.3         5.2          0.735
24,908     8.73     122    RPS5       18.6         NA^c         0.899
25,081     4.65     81     MRP8       9.3          NA^c         0.241
25,960     6.06     116    RPE1       5.8          0.7          0.372
26,378     9.55     127    RPS3       96.8         NA^c         0.863
26,467     5.18     100    VMA4       10.5         3.7          0.427
26,661     5.84     98     TPI1       NA^d         NA^c         0.900
27,156     5.56     93     PRE8       6.9          0.7          0.129
27,334     6.13     115    YHR049W    18.4         2.2          0.520
27,472     5.33     92     YNL010W    31.6         3.7          0.421
27,480     8.95     123    GPM1       10.0         169.4        0.902
27,480     8.95     124    GPM1       231.4        169.4        0.902
27,480     8.95     125    GPM1       7.5          169.4        0.902
27,809     5.97     139    HOR2       5.7          0.7          0.381
27,874     4.46     78     YST1       13.6         52.8         0.805
28,595     4.51     41     PUP2       4.4          0.7          0.147
29,156     6.59     114    YMR226C    14.5         2.2          0.283
29,244     8.40     120    DPM1       5.0          11.2         0.362
29,443     5.91     48     PRE4       3.4          3.7          0.162
30,012     6.39     138    PRB1       21.2         1.5          0.449
30,073     4.63     77     BMH1       14.7         28.2         0.454
30,296     7.94     121    OMP2       67.4         41.6         0.499
30,435     6.34     89     GPP1       70.2         11.2         0.703
31,332     5.57     88     ILV6       13.9         3.0          0.402
32,159     5.46     113    IPP1       63.1         3.7          0.752
32,263     6.00     149    HIS1       22.4         4.5          0.232
33,311^e   5.35     84     SPE3       15.1         6.7          0.408
34,465     5.60     129    ADE1       8.7          5.2          0.305
34,762     5.32     85     SEC14      10.9         6.0          0.373
34,797     5.85     42     URA1       49.5         8.9          0.237
34,799     6.04     90     BEL1       103.2        81.0         0.875
35,556     5.97     43     YDL124W    6.4          4.5          0.206
35,619     8.41     59     TDH1       69.8         32.7^c       0.940
35,650     5.49     68     CAR1       5.2          3.0          0.339
35,712     6.72     117    TDH2       49.6         473.0^c      0.982
35,712     6.72     154    TDH2       863.5        473.0^c      0.982
35,712     6.72     155    TDH2       79.4         473.0^c      0.982
36,272     4.85     128    APA1       8.7          0.7          0.425
36,358     5.05     75     YJR105W    17.6         17.1         0.522
36,358     5.05     76     YJR105W    27.5         17.1         0.522
36,596     6.37     79     ADH2       58.9         260.0^c      0.711
36,714     6.30     102    ADH1       746.1        260.0        0.913
36,714     6.30     103    ADH1       17.6         260.0        0.913
36,714     6.30     104    ADH1       61.4         260.0        0.913
36,714     6.30     105    ADH1       52.7         260.0        0.913
37,033     6.23     44     TAL1       44.8         3.7          0.701
37,796     7.36     57     IDH2       29.4         6.7          0.330
37,886     6.49     106    ILV5       76.0         4.5          0.892
38,700     7.83     55     BAT1       30.9         11.2         0.469
38,702     6.24     46     QCR2       NA^d         2.2          0.326



TABLE 1—Continued

Mol wt     pI       Spot   YPD gene   Protein      mRNA         Codon
                    no.    name^a     abundance    abundance    bias
                                      (10^3        (copies/
                                      copies/cell) cell)

39,477     5.38     86     FBA1       17.8         183.6        0.935
39,477     5.38     87     FBA1       427.2        183.6        0.935
39,540     6.30     150    HOM2       60.3         4.3          0.392
39,561     6.12     156    PSA1       96.4         27.3         0.718
41,158     6.01     49     YNL134C    14.9         1.3          0.316
41,623     7.18     58     BAT2       19.0         8.9          0.250
41,728     7.29     110    ERG10      24.1         4.3          0.343
41,900     5.42     74     TOM40      22.3         2.2          0.375
42,402     6.29     45     CYS3       6.7          8.9          0.621
42,883     5.63     67     DYS1       15.8         5.2          0.326
43,409     6.31     107    SER1       10.3         1.3          0.292
43,421     5.39     91     ERG6       2.2          14.1         0.408
44,174     7.32     56     YBR025C    13.1         6.0          0.684
44,682     4.99     72     TIF1       2.9          39.4         0.834
44,707     7.77     108    PGK1       23.7         165.7        0.897
44,707     7.77     109    PGK1       315.2        165.7        0.897
46,080     6.72     30     CAR2       15.4         NA^c         0.495
46,383     8.32     53     IDP1       7.7          0.7          0.436
46,553     5.98     47     IDP2       32.4         NA^c         0.197
46,679     6.39     50     ENO1       35.4         0.7          0.930
46,679     6.39     51     ENO1       6.6          0.7          0.930
46,679     6.39     52     ENO1       2.2          0.7          0.930
46,773     5.82     63     ENO2       15.3         289.1        0.960
46,773     5.82     64     ENO2       635.3        289.1        0.960
46,773     5.82     65     ENO2       93.0         289.1        0.960
46,773     5.82     66     ENO2       31.0         289.1        0.960
47,402     6.09     126    COR1       2.3          0.7          0.422
47,666     8.98     54     AAT2       11.7         6.0          0.338
48,364     5.25     73     WTM1       74.3         13.4         0.365
48,530     6.20     61     MET17      38.1         29.0         0.376
48,904     5.18     69     LYS9       16.2         3.7          0.463
48,987     4.90     153    SUP45      29.6         11.9         0.377
49,727     5.47     70     PRO2       13.6         5.2          0.297
49,912     9.27     62     TEF2       558.3        282.0        0.932
50,444     5.67     35     YDR190C    4.8          2.2          0.228
50,837     6.11     32     YEL047C    3.8          1.3          0.387
50,891     4.39     151    TUB2       11.2         7.4          0.404
51,547     6.80     27     LPD1       18.9         2.2          0.351
52,216     7.25     29     SHM2       19.7         7.4          0.722
52,859     5.34     37     YFR044C    30.2         6.7          0.442
53,798     5.19     71     HXK2       26.5         7.4          0.756
53,803     6.05     145    GYP6       4.4          0.7          0.147
54,403     5.29     39     ALD6       37.7         2.2          0.664
54,403     5.29     40     ALD6       6.6          2.2          0.664
54,502     6.20     31     ADE13      6.3          1.3          0.417
54,543     7.75     25     PYK1       225.3        101.8        0.965
54,543     7.75     26     PYK1       39.8         101.8        0.965
55,221     6.66     146    YEL071W    16.3         3.0          0.244
55,295     4.35     134    PDI1       66.2         14.1         0.389
55,364     5.98     24     GLK1       22.6         6.0          0.237
55,481     7.97     118    ATP1       21.6         2.2          0.637
55,886     6.47     28     CYS4       22.2         NA^c         0.444
56,167     5.83     33     ARO8       14.3         3.0          0.324
56,167     5.83     34     ARO8       9.1          3.0          0.324
56,384     6.36     20     CYB2       18.9         NA^c         0.259
57,366     5.53     60     FRS2       2.3          0.7          0.451
57,383     5.98     144    ZWF1       5.6          0.7          0.215
57,464     5.49     36     THR4       21.4         3.7          0.508
57,512     5.50     7      SRV2       6.3          NA^c         0.260
57,727     4.92     152    VMA2       33.7         8.9          0.546
58,573     6.47     17     ACH1       4.4          1.3          0.327
58,573     6.47     18     ACH1       5.4          1.3          0.327
61,353     5.87     21     PDC1       6.5          200.7        0.962
61,353     5.87     22     PDC1       303.2        200.7        0.962
61,353     5.87     23     PDC1       16.3         200.7        0.962
61,649     5.54     38     CCT8       2.2          1.3          0.271

Continued on following page



1724 GYGI ET AL.

MOL. CELL. BIOL.

TABLE 1—Continued

Mol wt     pI       Spot   YPD gene   Protein      mRNA         Codon
                    no.    name^a     abundance    abundance    bias
                                      (10^3        (copies/
                                      copies/cell) cell)

61,902     6.21     101    PDC5       4.3          NA^c         0.828
62,266     6.19     16     ICL1       20.1         NA^c         0.327
62,862     8.02     19     ILV3       5.3          4.5          0.548
63,082     6.40     119    PGM2       2.2          3.0          0.402
64,335     5.77     5      PAB1       30.4         1.5          U.DIO
66,120     5.42     8      STI1       6.7          0.7          0.313
66,120     5.42     9      STI1       6.4          0.7          0.313
66,450     5.29     141    SSB2       7.0          NA^c         0.880
66,450     5.29     142    SSB2       2.3          NA^c         0.880
66,456     5.23     10     SSB1       64.5         79.5         0.907
66,456     5.23     11     SSB1       59.0         79.5         0.907
66,456     5.23     12     SSB1       13.7         79.5         0.907
           3.oZ     oZ     1 T71 ^A IJHJ4   3.1    3.U          ft A(Y1
69,313     4.90     13     SSA2       24.3         18.6         0.892
69,313     4.90     14     SSA2       77.1         18.6         0.892
74,378     8.46     15     YKL029C    2.8          3.7          0.353
75,396     5.82     6      GRS1       5.5          7.4          0.500
85,720     6.25     1      MET6       2.0          NA^c         0.772
85,720     6.25     2      MET6       10.9         NA^c         0.772
85,720     6.25     3      MET6       1.4          NA^c         0.772
93,276     6.11     131    EFT1       17.9         41.6         0.890
93,276     6.11     132    EFT1       5.7          41.6         0.890
102,064^e  6.6r     94     ADE3       4.8          5.2          0.423
107,482^e  5.33^e   4      MCM3       2.7          NA^c         0.240

^a YPD gene names are available from the YPD website (39).
^b NA, calculation could not be performed or was not available.
^c mRNA data inconclusive or NA.
^d No methionines in predicted ORF; therefore, protein concentration was not
determined.
^e Measured molecular weight or pI did not match theoretical molecular weight
or pI.



Protein quantitation. [35S]methionine-labeled gels were exposed to X-ray film
overnight, and then the silver stain and film were used to excise 156 spots of
varying intensities, molecular weights, and pIs. The excised spots were placed in
0.6-ml microcentrifuge tubes, and scintillation cocktail (100 µl) was added. The
samples were vortexed and counted. In addition, two parallel gels were electroblotted
to polyvinylidene difluoride membranes. The membranes were exposed
to X-ray film, and four intense single spots were excised from each membrane
and subjected to amino acid analysis. For these four spots, a mean of 209 ± 4
cpm/pmol of protein/methionine was found. This number was used to quantitate
all remaining spots in conjunction with the number of methionines present in the
protein.
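
The conversion from scintillation counts to protein amount can be sketched as below. Only the calibration value (209 cpm/pmol per methionine) comes from the text; the function names are illustrative, and `cells_loaded` is a hypothetical parameter, since the number of cell equivalents on the gel is not stated in this section.

```python
AVOGADRO = 6.02214076e23        # molecules per mole
CPM_PER_PMOL_PER_MET = 209.0    # calibration reported in the text (209 +/- 4)


def protein_pmol(spot_cpm, methionines):
    """Picomoles of protein in a spot from its counts and methionine content.

    `methionines` is the number of methionines retained in the mature protein;
    deciding whether the initiator methionine counts is left to the caller.
    """
    if methionines < 1:
        raise ValueError("no methionine: protein cannot be quantified this way")
    return spot_cpm / (CPM_PER_PMOL_PER_MET * methionines)


def copies_per_cell(spot_cpm, methionines, cells_loaded):
    """Molecules per cell; cells_loaded is an assumed, user-supplied value."""
    molecules = protein_pmol(spot_cpm, methionines) * 1e-12 * AVOGADRO
    return molecules / cells_loaded


# A spot counting 2,090 cpm from a protein with five methionines holds 2.0 pmol.
print(protein_pmol(2090, 5))
```

Proteins without methionine raise an error here, mirroring the entries in Table 1 for which no protein concentration was determined.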

To ensure that proteins were labeled to equilibrium, parallel 2D gels were 
prepared and run on yeast metabolically labeled for 1, 2, 6, or 18 h. The
corresponding 156 spots were excised from each gel, and radioactivity was mea- 
sured by liquid scintillation counting for each spot. Calculated protein levels were 
highly reproducible for all time points measured after 1 h. 

Calculation of codon bias and predicted half-life. Codon bias values were
extracted from the YPD spreadsheet (17). Protein half-lives were calculated
based on the N-end rule (33). When the N-terminal processing was not known
experimentally, it was predicted based on the affinity of methionine
aminopeptidase (31).

RESULTS 

Characteristics of proteome approach. Nearly every facet of
proteome analysis hinges on the unambiguous identification of 
large numbers of expressed proteins in cells. Several tech- 
niques have been described previously for the identification of 
proteins separated by 2DE, including N-terminal and internal 
sequencing (1, 2), amino acid analysis (38), and more recently 
mass spectrometry (25). We utilized techniques based on mass
spectrometry because they afford the highest levels of sensitivity
and provide unambiguous identification. The specific procedure
used is schematically illustrated in Fig. 1 and is based
on three principles. First, proteins are removed from the gel by



proteolytic in-gel digestion, and the resulting peptides are sep- 
arated by on-line capillary high-performance liquid chromatog- 
raphy. Second, the eluting peptides are ionized and detected, and 
the specific peptide ions are selected and fragmented by the 
mass spectrometer. To achieve this, the mass spectrometer 
switches between the MS mode (for peptide mass identifica- 
tion) and the MS/MS mode (for peptide characterization and 
sequencing). Selected peptides are fragmented by a process 
called collision-induced dissociation (CID) to generate a tan- 
dem mass spectrum (MS/MS spectrum) that contains the pep- 
tide sequence information. Third, individual CID mass spectra 
are then compared by computer algorithms to predicted spec- 
tra from a sequence database. This results in the identification 
of the peptide and, by association, the protein(s) in the spot. 
Unambiguous protein identification is attained in a single anal- 
ysis by the detection of multiple peptides derived from the 
same protein. 

Protein identification. Yeast total cell protein lysate (40 µg),
metabolically labeled with [35S]methionine, was electrophoretically
separated by isoelectric focusing in the first dimension
and SDS-10% polyacrylamide gel electrophoresis in
the second dimension. Proteins were visualized by silver stain- 
ing and by autoradiography. Of the more than 1,000 proteins 
visible by silver staining, 156 spots were excised from the gel 
and subjected to in-gel tryptic digestion, and the resulting 
peptides were analyzed and identified by microspray LC- 
MS/MS techniques as described above. The proteins in this 
study were all identified automatically by computer software 
with no human interpretation of mass spectra. They are indi- 
cated in Fig. 2 and detailed in Table 1. 

The CID spectra shown in Fig. 3 indicate that the quality of 
the identification data generated was suitable for unambiguous 
protein identification. The spectra represent the amino acid 
sequences of tryptic peptides NSGDIVNLGSIAGR (Fig. 3A) 
and FAVGAFTDSLR (Fig. 3B). Both peptides were derived 
from protein S57593 (hypothetical protein YMR226C), which 
migrated to spot 114 (molecular weight, 29,156; pI, 6.59) in the
2D gel in Fig. 2. Five other peptides from the same analysis 
were also computer matched to the same protein sequence. 

Protein and mRNA quantitation. For the 156 genes investi- 
gated, the protein expression levels ranged from 2,200 (PGM2) 
to 863,000 (TDH2/TDH3) copies/cell. The levels of mRNA for 
each of the genes identified were calculated from SAGE fre- 
quency tables (35). These tables contain the mRNA levels for 
4,665 genes in yeast strain YPH499 grown to mid-log phase in 
YPD medium on glucose as a carbon source. In some in- 
stances, the mRNA levels could not be calculated for reasons 
stated in Materials and Methods. For the proteins analyzed in 
this study, mean transcript levels varied from 0.7 to 473 copies/ 
cell. 

Selection of the sample population for mRNA-protein ex- 
pression level correlation. The protein spots selected for iden- 
tification were selected from spots visible by silver staining in 
the 2D gel. An attempt was made not to include spots where 
overlap with other spots was readily apparent. The number of 
proteins identified was 156 (Table 1). Some proteins migrated 
to more than one spot (presumably due to differential protein 
processing or modifications), and protein levels from these 
spots were calculated by integrating the intensities of the dif- 
ferent spots. The 156 protein spots analyzed represented the 
products of 128 different genes. Genes were excluded from the 
correlation analysis only if part of the data set was missing; i.e., 
genes were excluded if (i) no mRNA expression data were 
available for the protein or putative SAGE tags were ambiguous,
(ii) the amino acid sequence did not contain methionine,
(iii) more than a single protein was conclusively identified as
migrating to the same gel spot, or (iv) the theoretical and
observed pIs and molecular weights could not be reconciled.
After these criteria were applied, the number of genes used in
the correlation analysis was 106.

FIG. 3. Tandem mass (MS/MS) spectra resulting from analysis of a single spot on a 2D gel. The first quadrupole selected a single mass-to-charge ratio (m/z) of 687.2
(A) or 592.6 (B), while the collision cell was filled with argon gas, and a voltage which caused the peptide to undergo fragmentation by CID was applied. The third
quadrupole scanned the mass range from 50 to 1,400 m/z. The computer program Sequest (8) was utilized to match MS/MS spectra to amino acid sequence by database
searching. Both spectra matched peptides from the same protein, S57593 (yeast hypothetical protein YMR226C). Five other peptides from the same analysis were
matched to the same protein.
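
The selection procedure (summing the spots of a gene, then applying the four exclusion criteria) can be sketched as follows. The record layout and function name are hypothetical, and excluding a gene when any one of its spots fails a criterion is our reading of the text. The example rows reuse values from Table 1.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Spot:
    gene: str
    protein: Optional[float]        # 10^3 copies/cell; None if no methionine
    mrna: Optional[float]           # copies/cell; None if missing or tag ambiguous
    single_protein: bool = True     # False if several proteins share the spot
    mw_pi_consistent: bool = True   # False if observed and theoretical values differ


def correlation_set(spots):
    """Return {gene: (summed protein, mRNA)} for genes with complete data."""
    by_gene = defaultdict(list)
    for spot in spots:
        by_gene[spot.gene].append(spot)
    kept = {}
    for gene, group in by_gene.items():
        excluded = any(
            s.mrna is None or s.protein is None
            or not s.single_protein or not s.mw_pi_consistent
            for s in group
        )
        if not excluded:
            # Proteins found in several spots are summed across those spots.
            kept[gene] = (sum(s.protein for s in group), group[0].mrna)
    return kept


spots = [
    Spot("ENO1", 35.4, 0.7), Spot("ENO1", 6.6, 0.7), Spot("ENO1", 2.2, 0.7),
    Spot("MET6", 2.0, None),                          # criterion (i)
    Spot("TPI1", None, None),                         # criterion (ii)
    Spot("ADE3", 4.8, 5.2, mw_pi_consistent=False),   # criterion (iv)
]
print(correlation_set(spots))
```

Only ENO1 survives in this small example, with its three spots summed to about 44.2 × 10^3 copies/cell.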



Codon bias and predicted half-lives. Codon bias is thought
to be an indicator of protein expression, with highly expressed
proteins having large codon bias values. The codon bias distribution
for the entire set of more than 6,000 predicted yeast
gene ORFs is presented in Fig. 4A. The interval with the
largest frequency of genes is between the codon bias values of 
0.0 and 0.1. This segment contains more than 2^00 genes. The 
distribution of the codon bias values of the 128 different genes 
found in this study (all protein spots from Fig. 2) is shown in 
Fig. 4B, and protein half-lives (predicted from applying the 
N-end rule [33] to the experimentally determined or predicted 
protein N termini) are shown in Fig. 4C. No genes were iden- 
tified with codon bias values less than 0.1 even though thou- 
sands of genes exist in this category. In addition, nearly all of 
the proteins identified had long predicted half-lives (greater 
than 30 h). 

Correlation of mRNA and protein expression levels. The 
correlation between mRNA and protein levels of the genes 
selected as described above is shown in Fig. 5. For the entire 
group (106 genes) for which a complete data set was generated,
there was a general trend of increased protein levels
resulting from increased mRNA levels. The Pearson product 
moment correlation coefficient for the whole data set (106 
genes) was 0.935. This number is highly biased by a small 
number of genes with very large protein and message levels. A 
more representative subset of the data is shown in the inset of 
Fig. 5. It shows genes for which the message level was below 10 
copies/cell and includes 69% (73 of 106 genes) of the data used 
in the study. The Pearson product moment correlation coeffi- 
cient for this data set was only 0.356. We also found that levels 
of protein expression coded for by mRNA with comparable 
abundance varied by as much as 30-fold and that the mRNA 
levels coding for proteins with comparable expression levels 
varied by as much as 20-fold. 

The distortion of the correlation value induced by the uneven
distribution of the data points along the x axis is further
demonstrated by the analysis in Fig. 6. The 106 samples in- 
cluded in the study were ranked by protein abundance, and the 
Pearson product moment correlation coefficient was repeat- 
edly calculated after including progressively more, and higher- 
abundance, proteins in each calculation. The correlation values 
remained relatively stable in the range of 0.1 to 0.4 if the 
lowest-expressed 40 to 95 proteins used in this study were
included. However, the correlation value steadily climbed by
the inclusion of each of the 11 very highly expressed proteins. 
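
The effect described above, in which a few very abundant species dominate the Pearson coefficient, can be reproduced with a short sketch. The numbers below are made-up toy values, not data from Table 1, and the function names are illustrative; the ranking procedure follows the description of Fig. 6.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def cumulative_correlation(pairs, min_n=3):
    """Rank (mRNA, protein) pairs by protein level and recompute r as
    progressively more abundant proteins are included."""
    ranked = sorted(pairs, key=lambda pair: pair[1])
    curve = []
    for n in range(min_n, len(ranked) + 1):
        mrna, protein = zip(*ranked[:n])
        curve.append((n, pearson(mrna, protein)))
    return curve


# Toy values only: six low-abundance genes with no real trend,
# plus two genes with very high message and protein levels.
low = [(1, 10), (2, 3), (3, 8), (4, 2), (5, 9), (6, 4)]
high = [(200, 500), (300, 800)]

r_low = pearson(*zip(*low))
r_all = pearson(*zip(*(low + high)))
curve = cumulative_correlation(low + high)
for n, r in curve:
    print(n, round(r, 3))
```

In this toy set the coefficient for the low-abundance genes alone is weak, while adding the two high-abundance points drives the overall value close to 1, the same qualitative behavior as reported for the 106-gene data set.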

Correlation of protein and mRNA expression levels with 
codon bias. Codon bias is the propensity for a gene to utilize 
the same codon to encode an amino acid even though other 
codons would insert the identical amino acid in the growing 
polypeptide sequence. It is further thought that highly ex- 
pressed proteins have large codon biases (3). To assess the 
value of codon bias for predicting mRNA and protein levels in 
exponentially growing yeast cells, we plotted the two experi- 
mental sets of data versus the codon bias (Fig. 7). The distri- 
bution patterns for both mRNA and protein levels with respect 
to codon bias were highly similar. There was high variability in 
the data within the codon bias range of 0.8 to 1.0. Although a
large codon bias generally resulted in higher protein and message
expression levels, codon bias did not appear to be predictive
of either protein levels or mRNA levels in the cell.

DISCUSSION 

The desired end point for the description of a biological 
system is not the analysis of mRNA transcript levels alone but 
also the accurate measurement of protein expression levels and 
their respective activities. Quantitative analysis of global 
mRNA levels currently is a preferred method for the analysis 
of the state of cells and tissues (11). Several methods which
either provide absolute mRNA abundance (34, 35) or relative
mRNA levels in comparative analyses (20, 27) have been described
elsewhere. The techniques are fast and exquisitely sensitive
and can provide mRNA abundance for potentially any
expressed gene. Measured mRNA levels are often implicitly or 
explicitly extrapolated to indicate the levels of activity of the 
corresponding protein in the cell. Quantitative analysis of pro- 
tein expression levels (proteome analysis) is much more time- 
consuming because proteins are analyzed sequentially one by 
one and is not general because analyses are limited to the 
relatively highly expressed proteins. Proteome analysis does, 
however, provide types of data that are of critical importance 
for the description of the state of a biological system and that
are not readily apparent from the sequence and the level of
expression of the mRNA transcript. This study attempts to
examine the relationship between mRNA and protein expression
levels for a large number of expressed genes in cells
representing the same state.

FIG. 5. Correlation between protein and mRNA levels for 106 genes in yeast growing at log phase with glucose as a carbon source. mRNA and protein levels were
calculated as described in Materials and Methods. The data represent a population of genes with protein expression levels visible by silver staining on a 2D gel chosen
to include the entire range of molecular weights, isoelectric focusing points, and staining intensities. The inset shows the low-end portion of the main figure. It contains
69% of the original data set. The Pearson product moment correlation for the entire data set was 0.935. The correlation for the inset containing 73 proteins (69%) was
only 0.356.

Limits in the sensitivity of current protein analysis technology
precluded a completely random sampling of yeast proteins.
We therefore based the study on those proteins visible by silver
staining on a 2D gel. Of the more than 1,000 visible spots, 156
were chosen to include the entire range of molecular weights, 
isoelectric focusing points, and staining intensities displayed on 
the 2D protein pattern. The genes identified in this study 
shared a number of properties. First, all of the proteins in this 
study had a codon bias of greater than 0.1 and 93% were 
greater than 0.2 (Fig. 4B). Second, with few exceptions, the 
proteins in this study had long predicted half-lives according to 
the N-end rule (Fig. 4C). Third, low-abundance proteins with 
regulatoiy functions such as transcription factors or protein 
kinases were not identified. 

Because the population of proteins used in this study appears
to be fairly homogeneous with respect to predicted half-life
and codon bias, it might be expected that the correlation of
the mRNA and protein expression levels would be stronger for
this population than for a random sample of yeast proteins. We
tested this assumption by evaluating the correlation value if
different subsets of the available data were included in the
calculation. The 106 proteins were ranked from lowest to highest
protein expression level, and the trend in the correlation
value was evaluated by progressively including more of the
higher-abundance proteins in the calculation (Fig. 6). The correlation
value when only the lower-abundance 40 to 95 proteins
were examined was consistently between 0.1 and 0.4. If
the 11 most abundant proteins were included, the correlation
steadily increased to 0.94. We therefore expect that the correlation
for all yeast proteins or for a random selection would be
less than 0.4. The observed level of correlation between
mRNA and protein expression levels suggests the importance
of posttranslational mechanisms controlling gene expression.
Such mechanisms include translational control (15) and control
of protein half-life (33). Since these mechanisms are also
active in higher eukaryotic cells, we speculate that there is no
predictive correlation between steady-state levels of mRNA
and those of protein in mammalian cells.
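
The subset analysis described above (rank the genes by protein level, then
recompute the Pearson correlation while progressively including
higher-abundance genes) can be sketched as follows. This is only an
illustration: the data are synthetic, and the function names, the random seed,
and the abundance ranges are assumptions made for the example, not values
from the study.

```python
import math
import random


def pearson(xs, ys):
    """Pearson product moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def cumulative_correlations(mrna, protein, start=40):
    """Rank genes by protein level; return (n, r) for the n lowest-abundance genes."""
    ranked = sorted(zip(protein, mrna))  # lowest protein abundance first
    out = []
    for n in range(start, len(ranked) + 1):
        subset = ranked[:n]
        p = [prot for prot, _ in subset]
        m = [msg for _, msg in subset]
        out.append((n, pearson(m, p)))
    return out


# Synthetic stand-in data (hypothetical): 95 low-abundance genes whose mRNA and
# protein levels are unrelated, plus 11 abundant genes whose levels track each other.
rng = random.Random(0)
mrna = [rng.uniform(0.5, 10) for _ in range(95)]
protein = [rng.uniform(2e3, 5e4) for _ in range(95)]
for _ in range(11):
    m = rng.uniform(50, 300)
    mrna.append(m)
    protein.append(m * 3000 * rng.uniform(0.8, 1.2))

results = cumulative_correlations(mrna, protein)
for n, r in results[::11]:
    print(f"{n:3d} genes included: r = {r:.2f}")
```

With data shaped like this, a few highly abundant genes are enough to pull the
overall coefficient up even when the bulk of the genes show little
relationship, which is the effect the paragraph above describes.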

Like other large-scale analyses, the present study has several
potential sources of error related to the methods used to determine
mRNA and protein expression levels. The mRNA
levels were calculated from frequency tables of SAGE data.
This method is highly quantitative because it is based on actual
sequencing of unique tags from each gene, and the number of
times that a tag is represented is proportional to the number of
mRNA molecules for a specific gene. This method has some
limitations including the following: (i) the magnitude of the
error in the measurement of mRNA levels is inversely proportional
to the mRNA levels, (ii) SAGE tags from highly similar
genes may not be distinguished and therefore are summed, (iii)
some SAGE tags are from sequences in the 3' untranslated
region of the transcript, (iv) incomplete cleavage at the SAGE
tag site by the restriction enzyme can result in two tags representing
one mRNA, and (v) some transcripts actually do not
generate a SAGE tag (34, 35).

1728 GYGI ET AL. MOL. CELL. BIOL.

[Figure 6 plot. y axis: correlation value, -0.10 to 0.90; x axis: number of genes included, 40 to 110, progressively including genes of increasingly higher protein abundance.]

FIG. 6. Effect of highly abundant proteins on Pearson product moment correlation coefficient for mRNA and protein abundance in yeast. The set of 106 genes was
ranked according to protein abundance, and the correlation value was calculated by including the 40 lowest-abundance genes and then progressively including the

For the SAGE method, the error associated with a value
increases with a decreasing number of transcripts per cell. The
conclusions drawn from this study are dependent on the quality
of the mRNA levels from previously published data (35).
Since more than 65% of the mRNA levels included in this
study were calculated to 10 copies/cell or less (40% were less
than 4 copies/cell), the error associated with these values may
be quite large. The mRNA levels were calculated from more
than 20,000 transcripts. Assuming that the estimate of 15,000
mRNA molecules per cell is correct (16), this would mean that
mRNA transcripts present at only a single copy per cell would
be detected 72% of the time (35). The mRNA levels for each
gene were carefully scrutinized, and only mRNA levels for
which a high degree of confidence existed were included in the
correlation value.
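
As a rough check on the single-copy detection figure, the sketch below
assumes a simple model that is not taken from the cited reference: each
sequenced tag is an independent draw from the mRNA pool, so a transcript
present at c copies among N molecules per cell is missed by every one of T
tags with probability (1 - c/N)^T. The function and parameter names are
illustrative.

```python
def detection_probability(tags_sequenced, mrna_per_cell, copies_per_cell=1):
    """Chance that at least one sequenced tag comes from the given transcript.

    Assumes independent, unbiased sampling of tags (an idealization).
    """
    per_tag = copies_per_cell / mrna_per_cell
    return 1.0 - (1.0 - per_tag) ** tags_sequenced


p_single = detection_probability(tags_sequenced=20_000, mrna_per_cell=15_000)
print(f"single-copy transcript detected with probability {p_single:.3f}")
```

Under these assumptions the result is about 0.74, in the same range as the
72% quoted in the text; the exact tag count and sampling model behind the
published figure are not stated here, so close agreement rather than identity
is all that should be expected.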

Protein abundance was determined by metabolic radiolabeling
with [35S]methionine. The calculation required knowledge
of three variables: the number of methionines in the mature
protein, the radioactivity contained in the protein, and the
specific activity of the radiolabel normalized per methionine.
The number of methionines per protein was determined from
the amino acid sequence of the proteins identified by tandem
mass spectrometry. For some proteins, it was not known
whether the methionine of the nascent polypeptide was processed
away. The N termini of those proteins were predicted
based on the specificity of methionine aminopeptidase (31). If
the N-terminal processing did not conform to the predicted
specificity of processing enzymes, the calculation of the number
of methionines would be affected. This discrepancy would
affect most the quantitation of a protein with a very low number
of methionines. The average number of calculated methionines
per protein in this study was 7.2. We therefore expect
the potential for erroneous protein quantitation due to unusual
N-terminal processing to be small.
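
The three-variable calculation described above can be written out as a short
sketch. The units (counts per minute, counts per minute per mole of
methionine), the cell-count normalization, and all function and parameter
names are assumptions made for illustration; the text does not give the
exact formula used.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def protein_copies_per_cell(spot_cpm, n_methionines, cpm_per_mol_met, n_cells):
    """Estimate copies/cell from the radioactivity measured in one gel spot.

    spot_cpm        -- radioactivity in the spot (hypothetical unit: cpm)
    n_methionines   -- methionines in the mature protein
    cpm_per_mol_met -- specific activity normalized per methionine (cpm/mol)
    n_cells         -- number of cells represented by the loaded sample (assumed known)
    """
    mol_protein = spot_cpm / (cpm_per_mol_met * n_methionines)
    return mol_protein * AVOGADRO / n_cells


def relative_error_from_met_miscount(true_met, assumed_met):
    """Fractional error in the estimate if the methionine count is wrong.

    The estimate scales as 1 / n_methionines, so estimate/true = true_met/assumed_met.
    """
    return true_met / assumed_met - 1.0


# Purely illustrative numbers.
copies = protein_copies_per_cell(spot_cpm=1000, n_methionines=5,
                                 cpm_per_mol_met=2e14, n_cells=1e7)
print(f"{copies:.0f} copies/cell")
print(relative_error_from_met_miscount(true_met=7, assumed_met=8))  # many methionines
print(relative_error_from_met_miscount(true_met=1, assumed_met=2))  # few methionines
```

Because the estimate is inversely proportional to the methionine count,
miscounting one residue matters far more for a protein with one or two
methionines than for one near the study average of 7.2, which is the point
the paragraph makes.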



The amount of radioactivity contained in a single spot might
be the sum of the radioactivity of comigrating proteins. Because
protein identification was based on tandem mass spectrometric
techniques, comigrating proteins could be identified.
However, comigrating proteins were rarely detected in this
study, most likely because relatively small amounts of total
protein (40 µg) were initially loaded onto the gels, which resulted
in highly focused spots containing generally 1 to 25 ng of
protein. Because of the relatively small amount loaded, the
concentrations of any potentially comigrating protein would
likely be below the limit of detection of the mass spectrometry
technique used in this study (1 to 5 ng) and below the limit of
visualization by silver staining (1 to 5 ng). In the overwhelming
majority of the samples analyzed, numerous peptides from a
single protein were detected. It is assumed that any comigrating
proteins were at levels too low to be detected and that their
influence in the calculation would be small.

The specific activity of the radiolabel was determined by
relating the precise amount of protein present in selected spots
of a parallel gel, as determined by quantitative amino acid
composition analysis, to the number of methionines present in
the sequence of those proteins and the radioactivity determined
by liquid scintillation counting. It is possible that the
resulting number might be influenced by unavoidable losses
inherent in the amino acid analysis procedure applied. Because
four different proteins were utilized in the calculation and the
experiment was done in duplicate, the specific activity calculated
is thought to be highly accurate. Indeed, the specific
activities calculated for each of the four proteins varied by less
than 10%. Any inconsistencies in the calculation of the specific
activity would result in differences in the absolute levels calculated
but not in the relative numbers and would therefore not
influence the correlation value determined.

The protein quantitative method used eliminates a number
of potential errors inherent in previous methods for the quantitation
of proteins separated by 2DE, such as preferential
protein staining and bias caused by inequalities in the number
of radiolabeled residues per protein. Any 2D gel-based method
of quantitation is complicated by the fact that in some cases the
translation products of the same mRNA migrated to different
spots. One major reason is posttranslational modification or
processing of the protein. Also, artifactual proteolysis during
cell lysis and sample preparation can lead to multiple resolved
forms of the protein. In such cases, the protein levels of spots
coded for by the same mRNA were pooled. In addition, the
existence of other spots coded for by the same mRNA that
were not analyzed by mass spectrometry or that were below the
limit of detection for silver staining cannot be ruled out. However,
since this study is based on a class of highly expressed
proteins, the presence of undetected minor spots below silver
staining sensitivity corresponding to a protein analyzed in the
study would generally cause a relatively small error in protein
quantitation.

Codon bias is a measure of the propensity of an organism to
selectively utilize certain codons which result in the incorporation
of the same amino acid residue in a growing polypeptide
chain. There are 61 possible codons that code for 20 amino
acids. The larger the codon bias value, the smaller the number
of codons that are used to encode the protein (19). It is
thought that codon bias is a measure of protein abundance
because highly expressed proteins generally have large codon
bias values (3, 13).

Nearly all of the most highly expressed proteins had codon
bias values of greater than 0.8. However, we detected a number
of genes with high codon bias and relatively low protein abundance
(Fig. 7). For example, the expressed gene with both the
second largest protein and mRNA levels in the study was
ENO2_YEAST (775,000 and 289.1 copies/cell, respectively).
ENO1_YEAST was also present in the gel at much lower
protein and mRNA levels (44,200 and 0.7 copies/cell, respectively).
The codon bias values for ENO2 and ENO1 are similar
(0.96 and 0.93, respectively), but the expression of the two
genes is differentially regulated. Specifically, ENO1_YEAST is
glucose repressed (6) and was therefore present in low abundance
under the conditions used. Other genes with large codon
bias values that were not of high protein abundance in the gel
include EFT1, TIF1, HXK2, GSP1, EGD2, SHM2, and TAL1.
We conclude that merely determining the codon bias of a gene
is not sufficient to predict its protein expression level.

Interestingly, codon bias appears to be an excellent indicator
of the boundaries of current 2D gel proteome analysis technology.
There are thousands of genes with expressed mRNA
and likely expressed protein with codon bias values less than
0.1 (Fig. 4A). In this study, we detected none of them, and only
a very small percentage of the genes detected in this study had
codon bias values between 0.1 and 0.2 (Fig. 4B). Indeed, in
every examined yeast proteome study (5, 7, 13, 28) where the
combined total number of identified proteins is 300 to 400, this
same observation is true. It is expected that for the more
complex cells of higher eukaryotic organisms the detection of
low-abundance proteins would be even more challenging than
for yeast. This indicates that highly abundant, long-lived pro- 
teins are overwhelmingly detected in proteome studies. If pro- 
teome analysis is to provide truly meaningful information 
about cellular processes, it must be able to penetrate to the 
level of regulatory proteins, including transcription factors and 
protein kinases. A promising approach is the use of narrow- 
range focusing gels with immobilized pH gradients (IPG) (23). 
This would allow for the loading of significantly more protein 
per pH unit covered and also provide increased resolution of 
proteins with similar electrophoretic mobilities. A standard pH 
gradient in an isoelectric focusing gel covers a 7-pH-unit range 
(pH 3 to 10) over 18 cm. A narrow-range focusing gel might 
expand the range to 0.5 pH units over 18 cm or more. This 
could potentially increase by more than 10-fold the number of 
proteins that can be detected. Clearly, current proteome tech- 
nology is incapable of analyzing low-abundance regulatory pro- 
teins without employing an enrichment method for relatively 
low-abundance proteins. In conclusion, this study examined 
the relationship between yeast protein and message levels and 
revealed that transcript levels provide little predictive value 
with respect to the extent of protein expression.

If these minor cell proteins differ among cells to the same extent as the
more abundant proteins, as is commonly assumed, only a small number of protein
differences (perhaps several hundred) suffice to create very large differences
in cell morphology and behavior.

A Cell Can Change the Expression of Its Genes 
in Response to External Signals

Most of the specialized cells in a multicellular organism are capable of altering
their patterns of gene expression in response to extracellular cues. If a liver cell
is exposed to a glucocorticoid hormone, for example, the production of several
specific proteins is dramatically increased. Glucocorticoids are released during
periods of starvation or intense exercise and signal the liver to increase the
production of glucose from amino acids and other small molecules; the set of
proteins whose production is induced includes enzymes such as tyrosine
aminotransferase, which helps to convert tyrosine to glucose. When the hormone is no
longer present, the production of these proteins drops to its normal level.

Other cell types respond to glucocorticoids in different ways. In fat cells, for
example, the production of tyrosine aminotransferase is reduced, while some
other cell types do not respond to glucocorticoids at all. These examples illustrate
a general feature of cell specialization: different cell types often respond in
different ways to the same extracellular signal. Underlying this specialization are
features that do not change, which give each cell type its permanently distinctive
character. These features reflect the persistent expression of different sets of
genes.



Gene Expression Can Be Regulated at Many of the Steps 
in the Pathway from DNA to RNA to Protein

If differences between the various cell types of an organism depend on the
particular genes that the cells express, at what level is the control of gene expression
exercised? There are many steps in the pathway leading from DNA to protein, and
all of them can in principle be regulated. Thus a cell can control the proteins it
makes by (1) controlling when and how often a given gene is transcribed
(transcriptional control), (2) controlling how the primary RNA transcript is spliced or
otherwise processed (RNA processing control), (3) selecting which completed
mRNAs in the cell nucleus are exported to the cytoplasm (RNA transport control),
(4) selecting which mRNAs in the cytoplasm are translated by ribosomes
(translational control), (5) selectively destabilizing certain mRNA molecules in
the cytoplasm (mRNA degradation control), or (6) selectively activating,
inactivating, or compartmentalizing specific protein molecules after they have been
made (protein activity control) (Figure 9-2).

For most genes transcriptional controls are paramount. This makes sense
because, of all the possible control points illustrated in Figure 9-2, only
transcriptional control ensures that no superfluous intermediates are synthesized. In the
following sections we discuss the DNA and protein components that regulate the
initiation of gene transcription. We return at the end of the chapter to the other
ways of regulating gene expression.

Figure 9-2 Six steps at which eucaryote gene expression can be controlled.
Only controls that operate at steps 1 through 5 are discussed in this chapter.
The regulation of protein activity (step 6) is discussed in Chapter 5; this
includes reversible activation or inactivation by protein phosphorylation as
well as irreversible inactivation by proteolytic degradation.

Summary 

The genome of a cell contains in its DNA sequence the information to make many 
thousands of different protein and RNA molecules. A cell typically expresses only a
fraction of its genes, and the different types of cells in multicellular organisms arise
because different sets of genes are expressed. Moreover, cells can change the pattern
of genes they express in response to changes in their environment, such as signals from
other cells. Although all of the steps involved in expressing a gene can in principle be
regulated, for most genes the initiation of RNA transcription is the most important
point of control.



DNA-binding Motifs in Gene 
Regulatory Proteins

How does a cell determine which of its thousands of genes to transcribe? As
discussed in Chapter 8, the transcription of each gene is controlled by a regulatory
region of DNA near the site where transcription begins. Some regulatory regions
are simple and act as switches that are thrown by a single signal. Other regula- 
tory regions are complex and act as tiny microprocessors, responding to a vari- 
ety of signals that they interpret and integrate to switch the neighboring gene on 
or off. Whether complex or simple, these switching devices consist of two fun- 
damental types of components: (1) short stretches of DNA of defined sequence 
and (2) gene regulatory proteins that recognize and bind to them. 

We begin our discussion of gene regulatory proteins by describing how these 
proteins were discovered. 

Gene Regulatory Proteins Were Discovered Using
Bacterial Genetics

Genetic analyses in bacteria carried out in the 1950s provided the first evidence
of the existence of gene regulatory proteins that turn specific sets of genes on
or off. One of these regulators, the lambda repressor, is encoded by a bacterial
virus, bacteriophage lambda. The repressor shuts off the viral genes that code for
the protein components of new virus particles and thereby enables the viral genome
to remain a silent passenger in the bacterial chromosome, multiplying with
the bacterium when conditions are favorable for bacterial growth (see Figure
6-80). The lambda repressor was among the first gene regulatory proteins to be
characterized, and it remains one of the best understood, as we discuss later. 
Other bacterial regulators respond to nutritional conditions by shutting off genes 
encoding specific sets of metabolic enzymes when they are not needed. The lac 
repressor, for example, the first of these bacterial proteins to be recognized, turns 
off the production of the proteins responsible for lactose metabolism when this 
sugar is absent from the medium. 

The first step toward understanding gene regulation was the isolation of 
mutant strains of bacteria and bacteriophage lambda that were unable to shut 
off specific sets of genes. It was proposed at the time, and later proved, that most 
of these mutants were deficient in proteins acting as specific repressors for these 
sets of genes. Because these proteins, like most gene regulatory proteins, are 
present in small quantities, it was difficult and time-consuming to isolate them.
They were eventually purified by fractionating cell extracts on a series of standard
chromatography columns (see pp. 166-169). Once isolated, the proteins
were shown to bind to specific DNA sequences close to the genes that they

Figure 9-3 Double-helical structure of DNA. The major and minor grooves
on the outside of the double helix are indicated. The atoms are colored as
follows: carbon, dark blue; nitrogen, light blue; hydrogen, white; oxygen,
red; phosphorus, yellow.

Summary

The many types of cells in animals and plants are created largely through mechanisms
that cause different genes to be transcribed in different cells. Since many specialized
animal cells can maintain their unique character when grown in culture, the
regulatory mechanisms involved in creating them must be stable once established
and heritable when the cell divides, endowing the cell with a memory of its
developmental history. Procaryotes and yeasts provide unusually accessible model
systems in which to study gene regulatory mechanisms, some of which may be relevant
to the creation of specialized cell types in higher eucaryotes. One such mechanism
involves a competitive interaction between two (or more) gene regulatory proteins,
each of which inhibits the synthesis of the other; this can create a flip-flop
switch that switches a cell between two alternative patterns of gene expression.
Direct or indirect positive feedback loops, which enable gene regulatory proteins to
perpetuate their own synthesis, provide a general mechanism for cell memory.
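
The flip-flop switch described above can be illustrated with a minimal
simulation. The equations below are a generic mutual-repression model chosen
for this sketch, not a model given in the text: each regulator is synthesized
at a rate that falls as the other accumulates and is removed at a constant
per-molecule rate. The parameter values and names are assumptions.

```python
def simulate_toggle(u0, v0, alpha=10.0, hill=2, dt=0.01, steps=5000):
    """Euler integration of two regulators that repress each other's synthesis.

    du/dt = alpha / (1 + v**hill) - u
    dv/dt = alpha / (1 + u**hill) - v
    Returns the concentrations (arbitrary units) after steps * dt time units.
    """
    u, v = u0, v0
    for _ in range(steps):
        du = alpha / (1.0 + v ** hill) - u
        dv = alpha / (1.0 + u ** hill) - v
        u, v = u + dt * du, v + dt * dv
    return u, v


# The same system settles into opposite states from slightly different starts.
u_on = simulate_toggle(u0=1.0, v0=0.5)
v_on = simulate_toggle(u0=0.5, v0=1.0)
print("start with more u:", u_on)
print("start with more v:", v_on)
```

Whichever regulator starts with a small advantage ends up high while the
other is held low, and the state persists once reached, which is the sense in
which such a circuit can act as a two-state switch.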

In eucaryotes gene transcription is generally controlled by combinations of gene 
regulatory proteins. It is thought that each type of cell in a higher eucaryotic organism 
contains a specific combination of gene regulatory proteins that ensures the expression
of only those genes appropriate to that type of cell. A given gene regulatory protein
may be expressed in a variety of circumstances and typically is involved in the
regulation of many genes. 

In addition to diffusible gene regulatory proteins, inherited states of chromatin 
condensation are also utilized by eucaryotic cells to regulate gene expression. In ver- 
tebrates DNA methylation also plays a part, mainly as a device to reinforce decisions 
about gene expression that are made initially by other mechanisms. 



Posttranscriptional Controls

Although controls on the initiation of gene transcription are the predominant
form of regulation for most genes, other controls can act later in the pathway
from RNA to protein to modulate the amount of gene product that is made.
Although these posttranscriptional controls, which operate after RNA polymerase
has bound to the gene's promoter and begun RNA synthesis, are less common
than transcriptional control, for many genes they are crucial. It seems that every
step in gene expression that could be controlled in principle is likely to be
regulated under some circumstances for some genes.

We consider the varieties of posttranscriptional regulation in temporal order,
according to the sequence of events that might be experienced by an RNA
molecule after its transcription has begun (Figure 9-72).

Figure 6-3 Genes can be expressed with different efficiencies. Gene A is
transcribed and translated much more efficiently than gene B. This allows the
amount of protein A in the cell to be

FROM DNA TO RNA

Transcription and translation are the means by which cells read out, or express,
the genetic instructions in their genes. Because many identical RNA copies can 
be made from the same gene, and each RNA molecule can direct the synthesis 
of many identical protein molecules, cells can synthesize a large amount of 
protein rapidly when necessary. But each gene can also be transcribed and 
translated with a different efficiency, allowing the cell to make vast quantities of 
some proteins and tiny quantities of others (Figure 6-3). Moreover, as we see in 
the next chapter, a cell can change (or regulate) the expression of each of its 
genes according to the needs of the moment— most obviously by controlling 
the production of its RNA. 
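
To make the point about differing efficiencies concrete, the sketch below
uses a standard simplified two-stage model that is not part of the text: mRNA
is made at a constant rate and decays in proportion to its level, and protein
is made in proportion to mRNA and likewise decays. At steady state, synthesis
balances decay for each species. All rate values and names are hypothetical.

```python
def steady_state_levels(k_transcription, k_mrna_decay, k_translation, k_protein_decay):
    """Steady-state mRNA and protein levels for a two-stage expression model.

    dm/dt = k_transcription - k_mrna_decay * m          ->  m* = k_tx / d_m
    dp/dt = k_translation * m - k_protein_decay * p     ->  p* = k_tl * m* / d_p
    """
    mrna = k_transcription / k_mrna_decay
    protein = k_translation * mrna / k_protein_decay
    return mrna, protein


# Hypothetical genes: A is transcribed and translated ten times more efficiently than B.
gene_a = steady_state_levels(10.0, 1.0, 20.0, 0.1)
gene_b = steady_state_levels(1.0, 1.0, 2.0, 0.1)
print("gene A (mRNA, protein):", gene_a)
print("gene B (mRNA, protein):", gene_b)
```

In this toy model the two efficiencies multiply, so a tenfold difference at
each of the two steps yields a hundredfold difference in protein level while
the mRNA levels differ only tenfold.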

Portions of DNA Sequence Are Transcribed into RNA 

The first step a cell takes in reading out a needed part of its genetic instructions 
is to copy a particular portion of its DNA nucleotide sequence— a gene— into an 
RNA nucleotide sequence. The information in RNA, although copied into another 
chemical form, is still written in essentially the same language as it is in DNA:
the language of a nucleotide sequence. Hence the name transcription.

Like DNA, RNA is a linear polymer made of four different types of nucleotide
subunits linked together by phosphodiester bonds (Figure 6-4). It differs from
DNA chemically in two respects: (1) the nucleotides in RNA are
ribonucleotides, that is, they contain the sugar ribose (hence the name
ribonucleic acid) rather than deoxyribose; (2) although, like DNA, RNA contains the
bases adenine (A), guanine (G), and cytosine (C), it contains the base uracil (U)
instead of the thymine (T) in DNA. Since U, like T, can base-pair by
hydrogen-bonding with A (Figure 6-5), the complementary base-pairing properties
described for DNA in Chapters 4 and 5 apply also to RNA (in RNA, G pairs with
C, and A pairs with U). It is not uncommon, however, to find other types of base
pairs in RNA; for example, G pairing with U occasionally.
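
The base-pairing rules just stated (G with C, and A in DNA pairing with U in
RNA) are enough to write a minimal sketch of how an RNA sequence follows from
a DNA template strand. The function name and the convention of writing the
template 3' to 5' are choices made for this example.

```python
# Base in the DNA template -> complementary base placed in the RNA.
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Return the RNA (written 5' to 3') complementary to a DNA template strand.

    The template is given 3' to 5', so the RNA can be built left to right.
    Raises KeyError on any character other than A, C, G, or T.
    """
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in template_3_to_5.upper())


rna = transcribe("TACGGA")
print(rna)
```

Only the standard pairs are modeled here; the occasional G-U pairs mentioned
in the text arise when RNA folds on itself, not when it is copied from DNA.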

Despite these small chemical differences, DNA and RNA differ quite dramatically
in overall structure. Whereas DNA always occurs in cells as a double-stranded
helix, RNA is single-stranded. RNA chains therefore fold up into a
variety of shapes, just as a polypeptide chain folds up to form the final shape of
a protein (Figure 6-6). As we see later in this chapter, the ability to fold into
complex three-dimensional shapes allows some RNA molecules to have structural
and catalytic functions.

Transcription Produces RNA Complementary to
One Strand of DNA

All of the RNA in a cell is made by DNA transcription, a process that has certain
similarities to the process of DNA replication discussed in Chapter 5.

Figure 6-89 Protein aggregates that cause human disease. (A) Schematic illustration of the type of
conformational change in a protein that produces material for a cross-beta filament. (B) Diagram illustrating
the self-infectious nature of the protein aggregation that is central to prion diseases. PrP is highly unusual
because the misfolded version of the protein, called PrP*, induces the normal PrP protein it contacts to
change its conformation, as shown. Most of the human diseases caused by protein aggregation are caused by
the overproduction of a variant protein that is especially prone to aggregation, but because this structure is
not infectious in this way, it cannot spread from one animal to another. (C) Drawing of a cross-beta filament,
a common type of protease-resistant protein aggregate found in a variety of human neurological diseases.
Because the hydrogen-bond interactions in a β sheet form between polypeptide backbone atoms (see Figure
3-9), a number of different abnormally folded proteins can produce this structure. (D) One of several
possible models for the conversion of PrP to PrP*, showing the likely change of two α-helices into four
β-strands. Although the structure of the normal protein has been determined accurately, the structure of the
infectious form is not yet known with certainty because the aggregation has prevented the use of standard
structural techniques. (C, courtesy of Louise Serpell, adapted from M. Sunde et al., J. Mol. Biol. 273:729-739,
1997; D, adapted from S.B. Prusiner, Trends Biochem. Sci. 21:482-487, 1996.)

animals and humans. It can be dangerous to eat the tissues of animals that contain
PrP*, as witnessed most recently by the spread of BSE (commonly referred
to as the "mad cow disease") from cattle to humans in Great Britain.

Fortunately, in the absence of PrP*, PrP is extraordinarily difficult to convert 
to its abnormal form. Although very few proteins have the potential to misfold 
into an infectious conformation, a similar transformation has been discovered 
to be the cause of an otherwise mysterious "protein-only inheritance" observed 
in yeast cells. 

There Are Many Steps From DNA to Protein 

We have seen so far in this chapter that many different types of chemical reac- 
tions are required to produce a properly folded protein from the information 
contained in a gene (Figure 6-90). The final level of a properly folded protein in 
a cell therefore depends upon the efficiency with which each of the many steps
is performed.

We discuss in Chapter 7 that cells have the ability to change the levels of
their proteins according to their needs. In principle, any or all of the steps in
Figure 6-90 could be regulated by the cell for each individual protein. However, as
we shall see in Chapter 7, the initiation of transcription is the most common
point for a cell to regulate the expression of each of its genes. This makes sense,
inasmuch as the most efficient way to keep a gene from being expressed is to
block the very first step: the transcription of its DNA sequence into an RNA
molecule.

Figure 6-90 The production of a protein by a eucaryotic cell. The final
level of each protein in a eucaryotic cell depends upon the efficiency of each
step depicted.



Summary 

The translation of the nucleotide sequence of an mRNA molecule into protein takes 
place in the cytoplasm on a large ribonucleoprotein assembly called a ribosome. The 
amino acids used for protein synthesis are first attached to a family of tRNA 
molecules, each of which recognizes, by complementary base-pair interactions, par- 
ticular sets of three nucleotides in the mRNA (codons). The sequence of nucleotides in 
the mRNA is then read from one end to the other in sets of three according to the 
genetic code. 
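
Reading an mRNA "in sets of three" can be sketched in a few lines. The
example below only splits a sequence into codons, beginning at the first AUG
(the start codon named in this summary); it does not translate the codons
into amino acids or stop at a stop codon. The function name is illustrative.

```python
def read_codons(mrna):
    """Split an mRNA into consecutive codons, starting at the first AUG.

    Returns an empty list if no AUG is present; an incomplete trailing
    codon (one or two leftover nucleotides) is dropped.
    """
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return []
    return [mrna[i:i + 3] for i in range(start, len(mrna) - 2, 3)]


codons = read_codons("GGAUGGCUUAAUC")
print(codons)
```

Starting one nucleotide earlier or later would give an entirely different set
of triplets, which is why locating the start codon fixes how the rest of the
message is read.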

To initiate translation, a small ribosomal subunit binds to the mRNA molecule
at a start codon (AUG) that is recognized by a unique initiator tRNA molecule. A
large ribosomal subunit binds to complete the ribosome and begin the elongation
phase of protein synthesis. During this phase, aminoacyl tRNAs, each bearing a
specific amino acid, bind sequentially to the appropriate codon in mRNA by forming
complementary base pairs with the tRNA anticodon. Each amino acid is added to the
C-terminal end of the growing polypeptide by means of a cycle of three sequential

Figure 7-5 Six steps at which eucaryotic gene expression can be controlled.
Controls that operate at steps 1 through 5 are discussed in this chapter.
Step 6, the regulation of protein activity, includes reversible activation or
inactivation by protein phosphorylation (discussed in Chapter 3) as well as
irreversible inactivation by proteolytic degradation (discussed in Chapter 6).



Gene Expression Can Be Regulated at Many of the Steps 
In the Pathway from DNA to RNA to Protein 

If differences among the various cell types of an organism depend on the partic- 
ular genes that the cells express, at what level is the control of gene expression 
exercised? As we saw in the last chapter, there are many steps in the pathway 
leading from DNA to protein, and all of them can in principle be regulated. Thus 
a cell can control the proteins it makes by (1) controlling when and how often a 
given gene is transcribed (transcriptional control), (2) controlling how the RNA 
transcript is spliced or otherwise processed (RNA processing control), (3) 
selecting which completed mRNAs in the cell nucleus are exported to the cytosol 
and determining where in the cytosol they are localized (RNA transport and 
localization control), (4) selecting which mRNAs in the cytoplasm are translated 
by ribosomes (translational control), (5) selectively destabilizing certain mRNA 
molecules in the cytoplasm (mRNA degradation control), or (6) selectively acti- 
vating, inactivating, degrading, or compartmentalizing specific protein 
molecules after they have been made (protein activity control) (Figure 7-5). 

For most genes transcriptional controls are paramount. This makes sense 
because, of all the possible control points illustrated in Figure 7-5, only tran- 
scriptional control ensures that the cell will not synthesize superfluous interme- 
diates. In the following sections we discuss the DNA and protein components 
that perform this function by regulating the initiation of gene transcription. We 
shall return at the end of the chapter to the additional ways of regulating gene 
expression. 

Summary 

The genome of a cell contains in its DNA sequence the information to make many 
thousands of different protein and RNA molecules. A cell typically expresses only a 
fraction of its genes, and the different types of cells in multicellular organisms arise 
because different sets of genes are expressed. Moreover, cells can change the pattern 
of genes they express in response to changes in their environment, such as signals 
from other cells. Although all of the steps involved in expressing a gene can in prin- 
ciple be regulated, for most genes the initiation of RNA transcription is the most 
important point of control. 



DNA-BINDING MOTIFS IN GENE REGULATORY 
PROTEINS 

How does a cell determine which of its thousands of genes to transcribe? As 
mentioned briefly in Chapters 4 and 6, the transcription of each gene is con- 
trolled by a regulatory region of DNA relatively near the site where transcription 
begins. Some regulatory regions are simple and act as switches that are thrown 
by a single signal. Many others are complex and act as tiny microprocessors, 
responding to a variety of signals that they interpret and integrate to switch the 
neighboring gene on or off. Whether complex or simple, these switching devices 

occur in the germ line, the cell lineage that gives rise to sperm or eggs. Most of 
the DNA in vertebrate germ cells is inactive and highly methylated. Over long 
periods of evolutionary time, the methylated CG sequences in these inactive 
regions have presumably been lost through spontaneous deamination events 
that were not properly repaired. However, promoters of genes that remain active 
in the germ cell lineages (including most housekeeping genes) are kept 
unmethylated, and therefore spontaneous deaminations of Cs that occur with- 
in them can be accurately repaired. Such regions are preserved in modern-day 
vertebrate cells as CG islands. In addition, any mutation of a CG sequence in the 
genome that destroyed the function or regulation of a gene in the adult would be 
selected against, and some CG islands are simply the result of a higher than nor- 
mal density of critical CG sequences. 

The mammalian genome contains an estimated 20,000 CG islands. Most of 
the islands mark the 5' ends of transcription units and thus, presumably, of 
genes. The presence of CG islands often provides a convenient way of identify- 
ing genes in the DNA sequences of vertebrate genomes. 
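
The overall deficiency of CG sequences, and their enrichment in islands, can be made concrete with a small Python sketch. It assumes one commonly used measure that is not given in the text above: the ratio of the observed number of CG dinucleotides to the number expected from the C and G content of the sequence. The function name `cpg_observed_expected` is hypothetical.

```python
def cpg_observed_expected(seq):
    """Observed/expected CG dinucleotide ratio for a DNA sequence.

    Assumed formula (not taken from the text above):
        (number of CG * sequence length) / (number of C * number of G)
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if c == 0 or g == 0:
        return 0.0  # no CG dinucleotide is possible
    return cg * n / (c * g)


print(cpg_observed_expected("CGCGCGCG"))  # CG-rich toy sequence
print(cpg_observed_expected("CCCCGGGG"))  # same base composition, one CG
```

Under this measure, a region that has lost most of its CG dinucleotides gives a ratio well below 1, while a region that has retained them gives a ratio closer to 1 or above.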

Summary 

The many types of cells in animals and plants are created largely through mecha- 
nisms that cause different genes to be transcribed in different cells. Since many 
specialized animal cells can maintain their unique character through many cell 
division cycles and even when grown in culture, the gene regulatory mechanisms 
involved in creating them must be stable once established and heritable when the 
cell divides. These features endow the cell with a memory of its developmental history. 
Bacteria and yeasts provide unusually accessible model systems in which to study 
gene regulatory mechanisms. One such mechanism involves a competitive interac- 
tion between two gene regulatory proteins, each of which inhibits the synthesis of the 
other; this can create a flip-flop switch that switches a cell between two alternative 
patterns of gene expression. Direct or indirect positive feedback loops, which enable 
gene regulatory proteins to perpetuate their own synthesis, provide a general mech- 
anism for cell memory. Negative feedback loops with programmed delays form the 
basis for cellular clocks. 

In eucaryotes the transcription of a gene is generally controlled by combinations 
of gene regulatory proteins. It is thought that each type of cell in a higher eucaryotic 
organism contains a specific combination of gene regulatory proteins that ensures 
the expression of only those genes appropriate to that type of cell. A given gene regu- 
latory protein may be active in a variety of circumstances and typically is involved 
in the regulation of many genes. 

In addition to diffusible gene regulatory proteins, inherited states of chromatin 
condensation are also used by eucaryotic cells to regulate gene expression. An espe- 
cially dramatic case is the inactivation of an entire X chromosome in female mam- 
mals. In vertebrates DNA methylation also functions in gene regulation, being used 
mainly as a device to reinforce decisions about gene expression that are made ini- 
tially by other mechanisms. DNA methylation also underlies the phenomenon of 
genomic imprinting in mammals, in which the expression of a gene depends on 
whether it was inherited from the mother or the father. 
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Figure 7-86 A mechanism to explain 
both the marked overall deficiency 
of CG sequences and their clustering 
into CG islands in vertebrate 
genomes. A black line marks the location 
of a CG dinucleotide in the DNA 
sequence, while a red "lollipop" indicates 
the presence of a methyl group on the 
CG dinucleotide. CG sequences that lie in 
regulatory sequences of genes that are 
transcribed in germ cells are unmethylated 
and therefore tend to be retained in 
evolution. Methylated CG sequences, on 
the other hand, tend to be lost through 
deamination of 5-methyl C to T, unless the 
CG sequence is critical for survival. 



POSTTRANSCRIPTIONAL CONTROLS 

In principle, every step required for the process of gene expression could be 
controlled. Indeed, one can find examples of each type of regulation, although 
any one gene is likely to use only a few of them. Controls on the initiation of 
gene transcription are the predominant form of regulation for most genes. But 
other controls can act later in the pathway from DNA to protein to modulate 
the amount of gene product that is made. Although these posttranscriptional 
controls, which operate after RNA polymerase has bound to the gene's promoter 
and begun RNA synthesis, are less common than transcriptional control, for 
many genes they are crucial. 
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CHAPTER 29 

Regulation of transcription 



The phenotypic differences that distinguish the 
various kinds of cells in a higher eukaryote are 
largely due to differences in the expression of 
genes that code for proteins, that is, those tran- 
scribed by RNA polymerase II. In principle, the 
expression of these genes might be regulated at 
any one of several stages. The concept of the 
"level of control" implies that gene expression 
is not necessarily an automatic process once it 
has begun. It could be regulated in a gene- 
specific way at any one of several sequential 
steps. We can distinguish (at least) five poten- 
tial control points, forming the series: 

Activation of gene structure 
↓ 
Initiation of transcription 
↓ 
Processing the transcript 
↓ 
Transport to cytoplasm 
↓ 
Translation of mRNA 

The existence of the first step is implied by 
the discovery that genes may exist in either of 
two structural conditions. Relative to the state 
of most of the genome, genes are found in 
an "active" state in the cells in which they 
are expressed (see Chapter 27). The change of 
structure is distinct from the act of transcrip- 
tion, and indicates that the gene is "transcrib- 
able." This suggests that acquisition of the 
"active" structure must be the first step in gene 
expression. 

Transcription of a gene in the active state is 
controlled at the stage of initiation, that is, by 
the interaction of RNA polymerase with its pro- 
moter. This is now becoming susceptible to 
analysis in the in vitro systems (see Chapter 
28). For most genes, this is a major control 
point; probably it is the most common level of 
regulation. 

There is at present no evidence for control 
at subsequent stages of transcription in eukary- 
otic cells, for example, via antitermination 
mechanisms. 

The primary transcript is modified by capping 
at the 5' end, and usually also by polyadenyla- 
tion at the 3' end. Introns must be spliced out 
from the transcripts of interrupted genes. The 
mature RNA must be exported from the nucleus 
to the cytoplasm. Regulation of gene expression 
by selection of sequences at the level of nuclear 
RNA might involve any or all of these stages, 
but the one for which we have most evidence 
concerns changes in splicing; some genes are 
expressed by means of alternative splicing pat- 
terns whose regulation controls the type of pro- 
tein product (see Chapter 3il). 

Finally, the translation of an mRNA in the cyto- 
plasm can be specifically controlled. There is little 
evidence for the employment of this mechanism in 
adult somatic cells, but it does occur in some 
embryonic situations, as described in Chapter 
The mechanism is presumed to involve the block- 
ing of initiation of translation of some mRNAs by 
specific protein factors. 

But having acknowledged that control of gene 
expression can occur at multiple stages, and 
that production of RNA cannot inevitably be 
equated with production of protein, it is clear 
that the overwhelming majority of regulatory 
events occur at the initiation of transcription. 
Regulation of tissue-specific gene transcription 
lies at the heart of eukaryotic differentiation; 
indeed, we see examples in Chapter 38 in 
which proteins that regulate embryonic devel- 
opment prove to be transcription factors. A reg- 
ulatory transcription factor serves to provide 
common control of a large number of target 
genes, and we seek to answer two questions 
about this mode of regulation: what identifies 
the common target genes to the transcription 
factor, and how is the activity of the transcrip- 
tion factor itself regulated in response to intrin- 
sic or extrinsic signals? 



Response elements identify genes under common 
regulation 



The principle that emerges from characterizing 
groups of genes under common control is that 
they share a promoter element that is recognized 
by a regulatory transcription factor. An element 
that causes a gene to respond to such a factor 
is called a response element; examples are the 
HSE (heat shock response element), GRE 
(glucocorticoid response element), SRE (serum 
response element). 

The properties of some inducible transcription 
factors and the elements that they recognize are 
summarized in Table 29.1. Response elements 
have the same general characteristics as 
upstream elements of promoters or enhancers. 
They contain short consensus sequences, and 
copies of the response elements found in dif- 
ferent genes are closely related, but not neces- 
sarily identical. The region bound by the factor 
extends for a short distance on either side of 
the consensus sequence. In promoters, the ele- 
ments are not present at fixed distances from 
the startpoint, but are usually <200 bp upstream 
of it. The presence of a single element usually 
is sufficient to confer the regulatory response, 
but sometimes there are multiple copies. 

Table 29.1 Inducible transcription factors bind to 
response elements that identify groups of promoters 
or enhancers subject to coordinate control. 

Regulatory Agent    Module    Consensus          Factor 

Heat shock          HSE       CNNGAANNTCCNNG     HSTF 
Glucocorticoid      GRE       TGGTACAAATGTTCT    Receptor 
Phorbol ester       TRE       TGACTCA            AP1 
Serum               SRE       CCATATTAGG         SRF 
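
As a rough illustration of how a short consensus sequence identifies candidate response elements, the following Python sketch scans a DNA string for positions that match a consensus, treating N as "any base". The function name `find_element` and the test sequences are hypothetical. Requiring an exact match at every non-N position is a simplification, since copies found in different genes are closely related but not necessarily identical.

```python
def find_element(consensus, sequence):
    """Return 0-based start positions where `consensus` matches `sequence`.

    Assumption: 'N' in the consensus matches any base; every other
    position must match exactly.
    """
    hits = []
    k = len(consensus)
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if all(c == "N" or c == b for c, b in zip(consensus, window)):
            hits.append(i)
    return hits


# TRE and SRE consensus sequences as listed in Table 29.1; the flanking
# bases are made up for the example.
print(find_element("TGACTCA", "AATGACTCAGG"))
print(find_element("CCATATTAGG", "TTCCATATTAGGAA"))
```

A scan of this kind only finds candidate sites; whether a site confers a regulatory response depends on the factor actually binding it.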

Response elements may be located in pro- 
moters or in enhancers. Some types of elements 
are typically found in one rather than the other: 
usually an HSE is found in a promoter, while a 
GRE is found in an enhancer. We assume that 
all response elements function by the same 
general principle. A gene is regulated by a 
sequence at the promoter or enhancer that is 
recognized by a specific protein. The protein 
functions as a transcription factor needed for 
RNA polymerase to initiate. Active protein is 
available only under conditions when the gene is 
to be expressed; its absence means that the pro- 
moter is not activated by this particular 
An example of a situation in which many 
genes are controlled by a single factor is pro- 
vided by the heat shock response. This is com- 
mon to a wide range of prokaryotes and 
eukaryotes and involves multiple controls of 
gene expression: an increase in temperature 
turns off transcription of some genes, turns on 
transcription of the heat shock genes, and 
causes changes in the translation of mRNAs. 
The control of the heat shock genes illustrates 
the differences between prokaryotic and 
eukaryotic modes of control. In bacteria, a new 
sigma factor is synthesized that directs RNA 
polymerase holoenzyme to recognize 
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Abstract 

Background: Prostate stem cell antigen (PSCA) is a recently defined homologue of the Thy-1/Ly-6 family of 
glycosylphosphatidylinositol (GPI)-anchored cell surface antigens. The purpose of the present study was to 
examine the expression status of PSCA protein and mRNA in clinical specimens of human prostate cancer (PCa) 
and to validate it as a potential molecular target for diagnosis and treatment of PCa. 

Materials and Methods: Immunohistochemical (IHC) and in situ hybridization (ISH) analyses of PSCA 
expression were simultaneously performed on paraffin-embedded sections from 20 benign prostatic hyperplasia 
(BPH), 20 prostatic intraepithelial neoplasm (PIN) and 48 prostate cancer (PCa) tissues, including 9 androgen- 
independent prostate cancers. The level of PSCA expression was semiquantitatively scored by assessing both the 
percentage and intensity of PSCA-positive staining cells in the specimens. We then compared PSCA expression 
between BPH, PIN and PCa tissues and analysed the correlations of PSCA expression level with pathological grade, 
clinical stage and progression to androgen-independence in PCa. 

Results: In BPH and low grade PIN, PSCA protein and mRNA staining were weak or negative and less intense 
and uniform than that seen in HGPIN and PCa. There were moderate to strong PSCA protein and mRNA 
expression in 8 of 11 (72.7%) HGPIN and in 40 of 48 (83.4%) PCa specimens examined by IHC and ISH analyses, 
with statistical significance compared with BPH (20%) and low grade PIN (22.2%) samples (p < 0.05, respectively). 
The expression level of PSCA increased with high Gleason grade, advanced stage and progression to androgen- 
independence (p < 0.05, respectively). In addition, IHC and ISH staining showed a high degree of correlation 
between PSCA protein and mRNA overexpression. 

Conclusions: Our data demonstrate that PSCA as a new cell surface marker is overexpressed by a majority of 
human PCa. PSCA expression correlates positively with adverse tumor characteristics, such as increasing 
pathological grade (poor cell differentiation), worsening clinical stage and androgen-independence, and 
speculatively with prostate carcinogenesis. PSCA protein overexpression results from upregulated transcription 
of PSCA mRNA. PSCA may have prognostic utility and may be a promising molecular target for diagnosis and 
treatment of PCa. 






World Journal of Surgical Oncology 2004, 2 

http://www.wjso.com/content/2/1/13 



Introduction 

Prostate cancer (PCa) is the second leading cause of can- 
cer-related death in American men and is an increasingly 
common cancer in China. Despite recent great 
progress in the diagnosis and management of local- 
ized disease, there continues to be a need for new diagnos- 
tic markers that can accurately discriminate between 
indolent and aggressive variants of PCa. There also contin- 
ues to be a need for the identification and characterization 
of potential new therapeutic targets on PCa cells. Current 
diagnostic and therapeutic modalities for recurrent and 
metastatic PCa have been limited by a lack of specific tar- 
get antigens of PCa. 

Although a number of prostate-specific genes have been 
identified (i.e. prostate specific antigen, prostatic acid 
phosphatase, glandular kallikrein 2), the majority of these 
are secreted proteins not ideally suited for many immuno- 
logical strategies. So, the identification of new cell surface 
antigens is critical to the development of new diagnostic 
and therapeutic approaches to the management of PCa. 

Reiter RE et al [1] reported the identification of prostate 
stem cell antigen (PSCA), a cell surface antigen that is pre- 
dominantly prostate specific. The PSCA gene encodes a 
123 amino acid glycoprotein, with 30% homology to 
stem cell antigen 2 (Sca-2). Like Sca-2, PSCA is a 
member of the Thy-1/Ly-6 family and is anchored by 
a glycosylphosphatidylinositol (GPI) linkage. mRNA in 
situ hybridization (ISH) localized PSCA expression in nor- 
mal prostate to the basal cell epithelium, the putative 
stem cell compartment of prostatic epithelium, suggesting 
that PSCA may be a marker of prostate stem/progenitor 
cells. 

In order to examine the status of PSCA protein and mRNA 
expression in human PCa and validate it as a potential 
diagnostic and therapeutic target for PCa, we used immu- 
nohistochemistry (IHC) and in situ hybridization (ISH) 
simultaneously, and conducted PSCA protein and mRNA 
expression analyses in paraffin-embedded tissue speci- 
mens of benign prostatic hyperplasia (BPH, n = 20), pros- 
tate intraepithelial neoplasm (PIN, n = 20) and prostate 
cancer (PCa, n = 48). Furthermore, we evaluated the possi- 
ble correlation of PSCA expression level with PCa tumori- 
genesis, grade, stage and progression to androgen- 
independence. 

Materials and methods 

Tissue samples 

All of the clinical tissue specimens studied herein were 
obtained from 80 patients aged 57-84 years by prostate- 
ctomy, transurethral resection of prostate (TURP) or biop- 
sies. The patients were classified as 20 cases of BPH, 20 
cases of PIN, 40 cases of primary PCa, including 9 patients 
with recurrent PCa and a history of androgen ablation 
therapy (orchiectomy and/or hormonal therapy), who 
were referred to as androgen-independent prostate can- 
cers. Eight specimens were harvested from these andro- 
gen-independent PCa patients prior to androgen ablation 
treatment. Each tissue sample was cut into two parts, one of 
which was fixed in 10% formalin for IHC and the other treated 
with 4% paraformaldehyde/0.1 M PBS, pH 7.4, in 0.1% 
DEPC for 1 h for ISH analysis, and then embedded in par- 
affin. All paraffin blocks examined were then cut into 5 
μm sections and mounted on the glass slides specific for 
IHC and ISH respectively in the usual fashion. An H&E- 
stained section of each PCa was evaluated and assigned a 
Gleason score by the experienced urological pathologist at 
our institution based on the criteria of Gleason score [2]. 
The Gleason sums are summarized in Table 1. Clinical 
staging was performed according to the Jewett-Whitmore- 
Prout staging system, as shown in Table 2. In the category 
of PIN, we graded the specimens into two groups, i.e. low 
grade PIN (grade I-II) and high grade PIN (HGPIN, 
grade III) on the basis of the literature [3,4]. 

Immunohistochemical (IHC) analysis 
Briefly, tissue sections were deparaffinized, dehydrated, 
and subjected to microwaving in 10 mmol/L citrate 
buffer, pH 6.0 (Boshide, Wuhan, China) in a 900 W oven 
for 5 min to induce epitope retrieval. Slides were allowed 
to cool at room temperature for 30 min. A primary mouse 
antibody specific to human PSCA (Boshide, Wuhan, 
China) with a 1:100 dilution was applied to incubate with 
the slides at room temperature for 2 h. Labeling was 
detected by sequentially adding biotinylated secondary 
antibodies and streptavidin-peroxidase, and localized 
using 3,3'-diaminobenzidine reaction. Sections were then 
counterstained with hematoxylin. Substitution of the pri- 
mary antibody with phosphate-buffered-saline (PBS) 
served as a negative-staining control. 

mRNA in situ hybridization (ISH) 

Five-μm-thick tissue sections were deparaffinized and 
dehydrated, then digested in pepsin solution (4 mg/ml in 
3% citric acid) for 20 min at 37.5°C, and further proc- 
essed for ISH. Digoxigenin-labeled sense and antisense 
human PSCA RNA probes (obtained from Boshide, 
Wuhan, China) were hybridized to the sections at 48°C 
overnight. The posthybridization wash with a high strin- 
gency was performed sequentially at 37°C in 2 x standard 
saline citrate (SSC) for 10 min, in 0.5 x SSC for 15 min 
and in 0.2 x SSC for 30 min. The slides were then incu- 
bated with biotinylated mouse anti-digoxigenin antibody at 
37.5°C for 1 h followed by washing in 1 x PBS for 20 min 
at room temperature, and then with streptavidin-peroxidase 
at 37.5°C for 20 min followed by washing in 1 x PBS for 
15 min at room temperature. Subsequently, the slides 
were developed with diaminobenzidine and then coun- 
terstained with hematoxylin to localize the hybridization 
signals. Sections hybridized with the sense control probes 
routinely did not show any specific hybridization signal 
above background. All slides were hybridized with PBS to 
substitute for the probes as a negative control. 

Table 1: Correlation of PSCA expression with Gleason score 

Table 2: Correlation of PSCA expression with clinical stage 

Scoring methods 

To determine the correlation between the results of PSCA 
immunostaining and mRNA in situ hybridization, the 
same scoring method was used in the present study for 
PSCA protein staining by IHC and PSCA mRNA staining 
by ISH. Each slide was read and scored independently by 
two experienced urological pathologists using Olympus 
BX-41 light microscopes. The evaluation was done in a 
blinded fashion. For each section, five areas of similar 
grade were analyzed semiquantitatively for the fraction of 
cells staining. Fifty percent of specimens were randomly 
chosen and rescored to determine the degree of interob- 
server and intraobserver concordance. There was greater 
than 95% intra- and interobserver agreement. 

The intensity of PSCA expression evaluated microscopi- 
cally was graded on a scale of 0 to 3+ with 3 being the 
highest expression observed (0, no staining; 1+, mildly 
intense; 2+, moderately intense; 3+, severely intense). The 
staining density was quantified as the percentage of cells 
staining positive for PSCA with the primary antibody or 
hybridization probe, as follows: 0 = no staining; 1 = posi- 
tive staining in <25% of the sample; 2 = positive staining 
in 25%-50% of the sample; 3 = positive staining in >50% 
of the sample. Intensity score (0 to 3+) was multiplied by 
the density score (0-3) to give an overall score of 0-9 
[1,5]. In this way, we were able to differentiate specimens 
that may have had focal areas of increased staining from 
those that had diffuse areas of increased staining [6]. The 
overall score for each specimen was then categorically 
assigned to one of the following groups: 0 score, negative 
expression; 1-2 scores, weak expression; 3-6 scores, mod- 
erate expression; 9 score, strong expression. 
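
The scoring scheme just described can be written out as a short Python sketch. The function names `composite_score` and `expression_category` are hypothetical; the score ranges and category labels are those given in the text. Since both factors are integers from 0 to 3, the only possible products are 0, 1, 2, 3, 4, 6 and 9, which is why the categories cover every score that can occur.

```python
def composite_score(intensity, density):
    """Multiply the intensity grade (0-3) by the density grade (0-3)."""
    if intensity not in range(4) or density not in range(4):
        raise ValueError("intensity and density must each be 0, 1, 2 or 3")
    return intensity * density


def expression_category(score):
    """Map an overall score to the expression category used in the study."""
    if score == 0:
        return "negative"
    if 1 <= score <= 2:
        return "weak"
    if 3 <= score <= 6:
        return "moderate"
    if score == 9:
        return "strong"
    raise ValueError("not a possible product of two 0-3 grades")


# Example: moderately intense staining (2+) in more than 50% of cells (3)
score = composite_score(2, 3)
print(score, expression_category(score))
```

Multiplying the two grades means that a specimen is classed as strong only when staining is both maximally intense and present in more than half of the cells.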

Statistical analysis 

Intensity and density of PSCA protein and mRNA expres- 
sion in BPH, PIN and PCa tissues were compared using the 
Chi-square and Student's t-test. Univariate associations 
between PSCA expression and Gleason score, clinical 
stage and progression to androgen-independence were 
calculated using Fisher's Exact Test. For all analyses, p < 
0.05 was considered statistically significant. 
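
For readers unfamiliar with Fisher's Exact Test, the following Python sketch shows how a p-value for a 2 x 2 table can be computed from the hypergeometric distribution. It is not the authors' analysis: the function name `fisher_exact_two_sided` is hypothetical, the two-sided definition used here (summing the probabilities of all tables no more likely than the observed one) is an assumption because the text does not say whether one- or two-sided tests were used, and the counts in the example are made up.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Assumption: "two-sided" means summing the probabilities of every table
    with the same margins that is no more probable than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-9)  # tolerance for float rounding
    )
    return min(1.0, total)


# Hypothetical counts: 9 of 10 specimens strongly stained in one group
# versus 2 of 10 in another.
p = fisher_exact_two_sided(9, 1, 2, 8)
print(p, p < 0.05)
```

The test is exact in the sense that it enumerates every table with the observed row and column totals, which makes it suitable for the small group sizes typical of studies like this one.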

Results 

PSCA expression in BPH 

In general, PSCA protein and mRNA were expressed 
weakly in individual samples of BPH. Some areas of 
prostate expressed weak levels (composite score 1-2), 
whereas other areas were completely negative (composite 
score 0). Four cases (20%) of BPH had moderate expres- 
sion of PSCA protein and mRNA (composite score 4-6) 
by IHC and ISH. In 2/20 (10%) BPH specimens, PSCA 
mRNA expression was moderate (composite score 3-6), 
but PSCA protein expression was weak (composite score 
2) in one and negative (composite score 0) in the other. 
PSCA expression was localized to the basal and secretory 
epithelial cells, and prostatic stroma was almost negative 
staining for PSCA protein and mRNA in all cases 
examined. 

PSCA expression in PIN 

In this study, we detected weak or negative expression of 
PSCA protein and mRNA (≤2 scores) in 7 of 9 (77.8%) 
low grade PIN and in 2 of 11 (18.2%) HGPIN, and mod- 
erate expression (3-6 scores) in the remaining 2 low grade PIN 
and 5 of 11 (45.5%) HGPIN. One HGPIN with moderate 
PSCA mRNA expression (6 score) was found to have weak stain- 
ing for PSCA protein (2 score) by IHC. Strong PSCA pro- 
tein and mRNA expression (9 score) were detected in the 
remaining 3 of 11 (27.3%) HGPIN. There was a statisti- 
cally significant difference of PSCA protein and mRNA 
expression levels observed between HGPIN and BPH (p < 
0.05), but no statistical difference between low 
grade PIN and BPH (p > 0.05). 

PSCA expression in PCa 

In order to determine if PSCA protein and mRNA can be 
detected in prostate cancers and if PSCA expression levels 
are increased in malignant compared with benign glands, 
forty-eight paraffin-embedded PCa specimens were ana- 
lysed by IHC and ISH. It was shown that 19 of 48 (39.6%) 
PCa samples stained very strongly for PSCA protein and 
mRNA with a score of 9 and another 21 (43.8%) speci- 
mens displayed moderate staining with scores of 4-6 (Fig- 
ure 1). In addition, 4 specimens with moderate to strong 
PSCA mRNA expression (scores of 4-9) had weak protein 
staining (a score of 2) by IHC analyses. Overall, PCa 
expressed a significantly higher level of PSCA protein and 
mRNA than any other specimen category in this study (p 
< 0.05, compared with BPH and PIN respectively). The 
result demonstrates that PSCA protein and mRNA are 
overexpressed by a majority of human PCa. 

Correlation of PSCA expression with Gleason score in PCa 
Using the semi-quantitative scoring method as described 
in Materials and Methods, we compared the expression 
level of PSCA protein and mRNA with Gleason grade of 
PCa, as shown in Table 1. Prostate adenocarcinomas were 
graded by Gleason score as 2-4 scores = well-differentia- 
tion, 5-7 scores = moderate-differentiation and 8-10 
scores = poor-differentiation [7]. Seventy-two percent of 
Gleason scores 8-10 prostate cancers had very strong 
staining of PSCA compared to 21% with Gleason scores 
5-7 and 17% with 2-4 respectively, demonstrating that 
poorly differentiated PCa had significantly stronger 
expression of PSCA protein and mRNA than moderately 
and well differentiated tumors (p < 0.05). As depicted in 
Figure 1, IHC and ISH analyses showed that PSCA protein 
and mRNA expression in several cases of poorly differen- 
tiated PCa were particularly prominent, with more intense 
and uniform staining. The results indicate that PSCA 
expression increases significantly with higher tumor grade 
in human PCa. 

Correlation of PSCA expression with clinical stage in PCa 
PSCA expression at each stage of PCa is 
shown in Table 2. Seventy-five percent of 
locally advanced and node positive cancers (i.e. C-D 
stages) expressed statistically high levels of PSCA versus 
32.5% that were organ confined (i.e. A-B stages) (p < 
0.05). The data demonstrate that PSCA expression 
increases significantly with advanced tumor stage in 
human PCa. 

Correlation of PSCA expression with androgen- 
independent progression of PCa 

All 9 specimens of androgen-independent prostate can- 
cers stained positive for PSCA protein and mRNA. Eight 
specimens were obtained from patients managed prior to 
androgen ablation therapy. Seven of eight (87.5%) of 
these androgen-independent prostate cancers were in the 
strongest staining category (score = 9), compared with 
three out of eight (37.5%) of patients with androgen- 
dependent cancers (p < 0.05). The results demonstrate 
that PSCA expression increases significantly with progres- 
sion to androgen-independence of human PCa. 

It is evident from the results above that within a majority 
of human prostate cancers the level of PSCA protein and 
mRNA expression correlates significantly with increasing 
grade, worsening stage and progression to androgen-inde- 
pendence. 

Correlation of PSCA immunostaining and mRNA in situ 
hybridization 

In all 88 specimens surveyed herein, we compared the 
results of PSCA IHC staining with mRNA ISH analysis. 
Positive staining areas and their intensity and density scores 
evaluated by IHC were identical to those seen by ISH in 79 
of 88 (89.8%) specimens (18/20 BPH, 19/20 PIN and 42/ 
48 PCa respectively). Importantly, 27/27 samples with 
PSCA mRNA composite scores of 0-2, 32/36 samples 
with scores of 3-6 and 22/24 samples with a score of 9 
also had PSCA protein expression scores of 0-2, 3-6 and 
9 respectively. However, in 5 samples with PSCA mRNA 
overall scores of 3-6 and in 2 with scores of 9 there was 
lower or negative PSCA protein expression (i.e. scores of 0- 
4), suggesting that this may reflect posttranscriptional 
modification of PSCA or that the epitopes recognized by 
PSCA mAb may be obscured in some cancers. The data 
demonstrate that the results of PSCA immunostaining 
were consistent with those of mRNA ISH analysis, show- 
ing a high degree of correlation between PSCA protein 
and mRNA expression. 

Figure 1 

Representatives of PSCA IHC and ISH staining in PCa (A, IHC staining; B, ISH staining; x200 magnification). A1, B1: negative con- 
trol of IHC and ISH. PBS replacing the primary antibody (A1) and hybridization with a sense PSCA probe (B1) showed no back- 
ground staining. A2, B2: a moderately differentiated PCa (Gleason score = 3+3 = 6) with moderate staining (composite score = 
6) in all malignant cells; A2: IHC shows not only cell surface but also apparent cytoplasmic staining of PSCA protein. A3, B3: a 
poorly differentiated PCa (Gleason score = 4+4 = 8) with very strong staining (composite score = 9) in all malignant cells. 
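
The overall agreement between IHC and ISH reported in the correlation section above can be reproduced from the per-group counts with a few lines of Python. This is only an arithmetic check: the dictionary keys are labels chosen here, and the function name `percent_agreement` is hypothetical.

```python
# (specimens with identical IHC and ISH results, specimens examined),
# as reported in the text for each tissue group
concordant = {"BPH": (18, 20), "PIN": (19, 20), "PCa": (42, 48)}


def percent_agreement(groups):
    """Return (agreeing, total, percentage rounded to one decimal place)."""
    agree = sum(a for a, _ in groups.values())
    total = sum(t for _, t in groups.values())
    return agree, total, round(100 * agree / total, 1)


print(percent_agreement(concordant))
```

The per-group counts sum to 79 of 88 specimens, which matches the overall figure of 89.8% stated in the text.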
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Discussion 

PSCA is homologous to a group of cell surface proteins 
that mark the earliest phase of hematopoietic develop- 
ment. PSCA mRNA expression is prostate-specific in nor- 
mal male tissues and is highly up-regulated in both 
androgen-dependent and -independent PCa xenografts 
(LAPC-4 tumors). We hypothesize that PSCA may play a 
role in PCa tumorigenesis and progression, and may serve 
as a target for PCa diagnosis and treatment. In this study, 
IHC and ISH showed that in general there were weak or 
absent PSCA protein and mRNA expression in BPH and 
low grade PIN tissues. However, PSCA protein and mRNA 
are widely expressed in HGPIN, the putative precursor of 
invasive PCa, suggesting that up-regulation of PSCA is an 
early event in prostate carcinogenesis. Recently, Reiter RE 
et al [1], using ISH analysis, reported that 97 of 118 (82%) 
HGPIN specimens stained strongly positive for PSCA 
mRNA. A very similar finding was seen on mouse PSCA 
(mPSCA) expression in mouse HGPIN tissues by Tran CP 
et al [8]. These data suggest that PSCA may be a new 
marker associated with transformation of prostate cells 
and tumorigenesis. 

We observed that PSCA protein and mRNA are highly
expressed in a large percentage of human prostate cancers,
including advanced, poorly differentiated,
androgen-independent and metastatic cases. Fluorescence-activated
cell sorting and confocal/immunofluorescent studies
demonstrated cell surface expression of PSCA protein in
Pca cells [9]. Our IHC expression analysis of PSCA shows
not only cell surface but also apparent cytoplasmic staining
of PSCA protein in Pca specimens (Figure 1). One possible
explanation for this is that anti-PSCA antibody can
recognize PSCA peptide precursors that reside in the
cytoplasm. Also, it is possible that the positive staining that
appears in the cytoplasm is actually from the overlying
cell membrane [5]. These data seem to indicate that PSCA
is a novel cell surface marker for human Pca.

Our results show that an elevated level of PSCA expression
correlates with high grade (i.e. poor differentiation),
increased tumor stage and progression to
androgen-independence of Pca. These findings support the original IHC
analyses by Gu Z et al [9], who reported that PSCA protein
was expressed in 94% of primary Pca and that the intensity of
PSCA protein expression increased with tumor grade,
stage and progression to androgen-independence. Our
results also corroborate the recent work of Han KR et al
[10], in which a significant association was found between
high PSCA expression and adverse prognostic features such as
high Gleason score, seminal vesicle invasion and capsular
involvement in Pca. It is suggested that PSCA
overexpression may be an adverse predictor for recurrence,
clinical progression or survival of Pca. Hara H et al
[11] used RT-PCR detection of PSA, PSMA and PSCA in 1
ml of peripheral blood to evaluate Pca patients with poor
prognosis. The results showed that among 58 Pca
patients, each PCR indicated prognostic value in the
hierarchy of PSCA>PSA>PSMA RT-PCR, and extraprostatic
cases with positive PSCA PCR showed lower
disease-progression-free survival than those with negative PSCA PCR,
demonstrating that PSCA can be used as a prognostic
factor. Dubey P et al [12] reported that elevated numbers of
PSCA+ cells correlate positively with the onset and
development of prostate carcinoma over a long time span in
the prostates of the TRAMP and PTEN +/- models
compared with normal prostates. Taken together with our
present findings, in which PSCA is overexpressed from
HGPIN to almost frank carcinoma, it is reasonable
to use increased PSCA expression level or
increased numbers of PSCA-positive cells in prostate
samples as a prognostic marker to predict the potential
onset of this cancer. These data raise the possibility that
PSCA may have diagnostic utility or clinical prognostic
value in human Pca.

The cause of PSCA overexpression in Pca is not known.
One possible mechanism is that it may result from PSCA
gene amplification. In humans, PSCA is located on
chromosome 8q24.2 [1], which is often amplified in
metastatic and recurrent Pca and considered to indicate a poor
prognosis [13-15]. Interestingly, PSCA is in close proximity
to the c-myc oncogene, which is amplified in >20% of
recurrent and metastatic prostate cancers [16,17]. Reiter
RE et al [18] reported that PSCA and MYC gene copy
numbers were co-amplified in 25% of tumors (five out of
twenty), demonstrating that PSCA overexpression is
associated with PSCA and MYC coamplification in Pca. Gu Z
et al [9] recently reported that in 102 specimens available
to compare the results of PSCA immunostaining with
their previous mRNA ISH analysis, 92 (90.2%) had
identically positive areas of PSCA protein and mRNA
expression. Taken together with our findings, in which we
detected moderate to strong expression of PSCA protein
and mRNA in 34 of 40 (85%) Pca specimens examined
simultaneously by IHC and ISH analyses, it is
demonstrated that PSCA protein and mRNA are overexpressed in
human Pca, and that the increased protein level of PSCA
resulted from the upregulated transcription of its
mRNA.

At present, the regulation mechanisms of human PSCA
expression and its biological function are yet to be
elucidated. PSCA expression may be regulated by multiple
factors [18]. Watabe T et al [19] reported that transcriptional
control is a major component regulating PSCA expression
levels. In addition, induction of PSCA expression may be
regulated or mediated through cell-cell contact and
protein kinase C (PKC) [20]. Homologues of PSCA have
diverse activities, and have themselves been involved in
carcinogenesis. Signalling through SCA-2 has been
demonstrated to prevent apoptosis in immature thymocytes
[21]. Thy-1 is involved in T cell activation and transduces
signals through src-like tyrosine kinases [22]. Ly-6 genes
have been implicated both in tumorigenesis and in
cell-cell adhesion [23-25]. Cell-cell or cell-matrix interaction is
critical for local tumor growth and spread to distal sites.
From its restricted expression in basal cells of normal
prostate and its homology to SCA-2, PSCA may play a role
in stem/progenitor cell function, such as self-renewal (i.e.
anti-apoptosis) and/or proliferation [1]. Taken together
with the results in the present study, we speculate that
PSCA may play a role in tumorigenesis and clinical
progression of Pca through affecting cell transformation and
proliferation. From our results, it is also suggested that
PSCA as a new cell surface antigen may have a number of
potential uses in the diagnosis, therapy and clinical
prognosis of human Pca. PSCA overexpression in prostate
biopsies could be used to identify patients at high risk to
develop recurrent or metastatic disease, and to
discriminate cancers from normal glands in prostatectomy
samples. Similarly, the detection of PSCA-overexpressing cells
in bone marrow or peripheral blood may identify and
predict metastatic progression better than current assays,
which identify only PSA-positive or PSMA-positive
prostate cells.

In summary, we have shown in this study that PSCA
protein and mRNA expression is maintained from
HGPIN through all stages of Pca in a majority of cases,
which may be associated with prostate carcinogenesis and
correlate positively with high tumor grade (poor cell
differentiation), advanced stage and androgen-independent
progression. PSCA protein overexpression is due to the
upregulation of its mRNA transcription. The results
suggest that PSCA may be a promising molecular marker for
the clinical prognosis of human Pca and a valuable target
for diagnosis and therapy of this tumor.
Abstract

Translation initiation is regulated in response to
nutrient availability and mitogenic stimulation and is
coupled with cell cycle progression and cell growth.
Several alterations in translational control occur in
cancer. Variant mRNA sequences can alter the
translational efficiency of individual mRNA molecules,
which in turn play a role in cancer biology. Changes in
the expression or availability of components of the
translational machinery and in the activation of
translation through signal transduction pathways can
lead to more global changes, such as an increase in
the overall rate of protein synthesis and translational
activation of the mRNA molecules involved in cell
growth and proliferation. We review the basic
principles of translational control, the alterations
encountered in cancer, and selected therapies
targeting translation initiation to help elucidate new
therapeutic avenues.

Introduction

The fundamental principle of molecular therapeutics in cancer
is to exploit the differences in gene expression between
cancer cells and normal cells. With the advent of cDNA array
technology, most efforts have concentrated on identifying
differences in gene expression at the level of mRNA, which
can be attributable either to DNA amplification or to
differences in transcription. Gene expression is quite complicated,
however, and is also regulated at the level of mRNA stability,
mRNA translation, and protein stability.

The power of translational regulation has been best recognized
among developmental biologists, because transcription
does not occur in early embryogenesis in eukaryotes. For
example, in Xenopus, the period of transcriptional quiescence
continues until the embryo reaches midblastula transition, the
4000-cell stage. Therefore, all necessary mRNA molecules are
transcribed during oogenesis and stockpiled in a translationally
inactive, masked form. The mRNA are translationally activated
at appropriate times during oocyte maturation, fertilization, and
early embryogenesis and thus are under strict translational
control.

Translation has an established role in cell growth. Basically,
an increase in protein synthesis occurs as a consequence
of mitogenesis. Until recently, however, little was
known about the alterations in mRNA translation in cancer,
and much is yet to be discovered about their role in the
development and progression of cancer. Here we review the
basic principles of translational control, the alterations
encountered in cancer, and selected therapies targeting
translation initiation to elucidate potential new therapeutic
avenues.

Basic Principles of Translational Control
Mechanism of Translation Initiation

Translation initiation is the main [...].
Translation initiation is a complex process in which the initiator
tRNA and the 40S and 60S ribosomal subunits are recruited to
the 5' end of a mRNA molecule and assembled by eukaryotic
translation initiation factors into an 80S ribosome at the start
codon of the mRNA (Fig. 1). The 5' end of eukaryotic mRNA is
capped, i.e., contains the cap structure m7GpppN
(7-methylguanosine-triphospho-5'-ribonucleoside). Most translation in
eukaryotes occurs in a cap-dependent fashion, i.e., the cap is
specifically recognized by the eIF4E, which binds the 5' cap.
The eIF4F translation initiation complex is then formed by the
assembly of eIF4E, the RNA helicase eIF4A, and eIF4G, a
scaffolding protein that mediates the binding of the 40S
ribosomal subunit to the mRNA molecule through interaction with
the eIF3 protein present on the 40S ribosome. eIF4A and eIF4B
participate in melting the secondary structure of the 5' UTR of
the mRNA. The 43S initiation complex (40S/eIF2/Met-tRNA/
GTP complex) scans the mRNA in a 5'→3' direction until it
encounters an AUG start codon. This start codon is then
base-paired to the anticodon of initiator tRNA, forming the 48S
initiation complex. The initiation factors are then displaced from the
48S complex, and the 60S ribosome joins to form the 80S
ribosome.
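
As an illustration only, the scanning step described above can be mimicked by a toy search for the first AUG along a sequence read 5'→3'. This is a minimal sketch under a deliberately simplified assumption: it ignores sequence context, secondary structure and the initiation factors themselves. The function name and the example sequences are hypothetical and do not come from the review.

```python
def first_start_codon(mrna):
    """Return the 0-based index of the first AUG found when reading
    the sequence 5'->3', or None if no AUG is present.

    DNA-style input is accepted; T is read as U.
    """
    seq = mrna.upper().replace("T", "U")
    idx = seq.find("AUG")
    return None if idx == -1 else idx


# Hypothetical sequence: a short leader followed by a start codon.
print(first_start_codon("GGCAUGAAAUGA"))
```

A sequence with no AUG returns None, which corresponds to a scanning complex that never encounters a start codon in this toy model.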

Unlike most eukaryotic translation, translation initiation of
certain mRNAs, such as the picornavirus RNA, is cap independent
and occurs by internal ribosome entry. This mechanism
does not require eIF4E. Either the 43S complex can bind
the initiation codon directly through interaction with the IRES in
the 5' UTR such as in the encephalomyocarditis virus, or it can
[...] scanning or transfer, as is the case with the poliovirus [...].

Fig. 1. Translation initiation in eukaryotes. [Panel labels:
unwinding of secondary structure; scanning; release of initiation
[...]; 80S formation. The remainder of the legend is illegible apart
from fragments referring to hyperphosphorylation and the release
of eIF4E.]

[Footnote, largely illegible: list of abbreviations, including
eIF4E, eukaryotic initiation factor 4E, and an entry containing
"eicosapentaenoic".]

Regulation of Translation Initiation

Translation initiation can be regulated by alterations in the
[...] status of the various factors involved. Key components
in translational regulation [...] may provide potential
therapeutic targets [...].

eIF4E. [...] a role in translation regulation [...] considered
the rate-limiting component for initiation of [...] involved in
splicing, mRNA 3' processing, and mRNA nucleocytoplasmic
[...] transcriptional level in response to serum or [...] may
cause preferential translation [...] 5' UTR that are normally
discriminated against by the translational machinery and thus
are inefficiently translated (4-7). [...] translation of vascular
endothelial growth factor, fibroblast growth factor-2, and
cyclin [...].

Another mechanism of control is the regulation of eIF4E
phosphorylation. eIF4E phosphorylation is mediated by [...]
pathway [...] extracellular signal-related kinases and the
stress-activated [...] mitogen-activated protein kinase [...]
derived growth factor, epidermal growth factor,
angiotensin II, src kinase overexpression, and [...]
expression, lead to eIF4E phosphorylation (14). The
phosphorylation status of eIF4E is [...] translational rate and
growth status of the [...]. eIF4E phosphorylation has also
[...] needed to understand [...] phosphorylation on eIF4E
activity.

Another mechanism of regulation is the alteration of eIF4E
[...] of eIF4E to the eIF4E-binding proteins (4E-BP, also
known as PHAS-I). 4E-BPs compete with [...] for a binding
site in eIF4E. The binding [...]. Hypophosphorylated [...]
hyperphosphorylation decreases this binding. Insulin,
angiotensin, epidermal growth factor, platelet-derived growth
factor, hepatocyte [...] growth factor, insulin-like growth
factors [...], interleukin 3, granulocyte-macrophage colony-
[...], and the adenovirus [...] have all been reported to
induce phosphorylation of 4E-BP1 and to decrease the ability
of 4E-BP1 to bind eIF4E (15, [...]). [...] 4E-BP1
dephosphorylation, an increase in eIF4E binding, and a
decrease in cap-dependent translation [...].
[...] Phosphorylation of ribosomal 40S protein [...] mouse
embryonic cells proliferate more slowly than do parental cells,
demonstrating that S6K has a positive influence on cell
proliferation (17). S6K regulates the translation of a group of
mRNAs possessing a 5' terminal [...] (5' TOP), found at the
5' UTR [...] translational machinery. Phosphorylation of S6K
is regulated [...] such as platelet-derived growth factor and
insulin-like growth factor I.
eIF2α Phosphorylation. The binding of the initiator tRNA
[...] by translation initiation [...]. Phosphorylation of the
α-subunit of eIF2 [...] global protein synthesis (21, 22).
eIF2α is phosphorylated under a variety of conditions, such
as viral infection, nutrient deprivation, heme deprivation,
and apoptosis (22). [...] heme-regulated inhibitor,
nutrient-regulated protein kinase, and the IFN-induced,
double-stranded RNA-activated protein kinase (PKR; Ref. 23).



[...] of intensive study because it [...] these pathways
[...] mTOR (also called FRAP or RAFT1). mTOR is the
mammalian homologue of the yeast TOR proteins that
regulate [...]. mTOR is a serine-threonine kinase that
modulates translation initiation by altering the
phosphorylation [...] 4E-BP1 and S6K (Fig. 2; Ref. 25).

4E-BP1 is phosphorylated on multiple residues [...]
phosphorylates the [...] and Thr-46 residues [...] with a
loss of eIF4E binding. Phosphorylation [...] is required for
subsequent [...] at several COOH-terminal, serum-sensitive
sites; a combination of these phosphorylation events appears
to be needed to [...] the [...] gene [...], PI3K [...]
pathway, and protein kinase C [...] also [...] 4E-BP1
phosphorylation (27-29).

S6K and 4E-BP1 are also regulated, in part, by PI3K and its
downstream protein kinase Akt. PTEN is a phosphatase that
negatively regulates PI3K signaling. PTEN null cells have
constitutively active Akt, with increased S6K activity and
S6 phosphorylation (30). S6K activity is inhibited both by
PI3K inhibitors wortmannin and LY294002 and by mTOR
inhibitor rapamycin ([...]4). Akt phosphorylates Ser-2448 in
mTOR in vitro, and this site is phosphorylated upon Akt
activation in vivo (31-3[...]). Thus, mTOR is regulated by the
PI3K/Akt pathway; however, this does not appear to be the
only mode of regulation of mTOR activity. Whether the PI3K
pathway also regulates S6K and 4E-BP1 phosphorylation
independent of mTOR is controversial.

Interestingly, mTOR autophosphorylation is blocked by
wortmannin but not by rapamycin (34). This seeming inconsistency
suggests that mTOR-responsive regulation of 4E-BP1 and S6K
activity occurs through a mechanism other than intrinsic mTOR
kinase activity. An alternate pathway for 4E-BP1 and S6K
phosphorylation by mTOR activity is by the inhibition of a
phosphatase. Treatment with calyculin A, an inhibitor of
phosphatases 1 and 2A, reduces rapamycin-induced
dephosphorylation of 4E-BP1 and S6K (35). PP2A interacts with
full-length S6K but not with a S6K mutant that is resistant to
dephosphorylation resulting from rapamycin. mTOR phosphorylates
PP2A in vitro; however, how this process alters PP2A activity is
not known. These results are consistent with the [...]
phosphorylation of a phosphatase by mTOR prevents
dephosphorylation of 4E-BP1 and S6K, and conversely, that nutrient
deprivation and rapamycin block inhibition of the phosphatase by
mTOR.
Polyadenylation. The poly(A) tail in eukaryotic mRNA is
important in enhancing translation initiation and mRNA
stability. Polyadenylation plays a key role in regulating gene
expression during oogenesis and early embryogenesis.

Some mRNA that are translationally inactive in the oocyte are
polyadenylated concomitantly with translational activation in
oocyte maturation, whereas other mRNAs that are
translationally active during oogenesis are deadenylated and
translationally silenced (36-38). Thus, control of poly(A) tail
synthesis is an important regulatory step in gene expression.
The 5' cap and poly(A) tail are thought to function
synergistically to regulate mRNA translational efficiency (39, 40).

RNA Packaging. Most RNA-binding proteins are assembled
on a transcript at the time of transcription, thus determining
the translational fate of the transcript (41). A highly
conserved family of Y-box proteins is found in cytoplasmic
messenger ribonucleoprotein particles, where the proteins
[...] recruitment of mRNA to the translational machinery
(41-43). The major mRNA-associated protein, YB-1,
destabilizes the interaction of eIF4E and the 5' mRNA cap in
vitro, and overexpression of YB-1 results in translational
repression in vivo ([...]). Thus, alterations in RNA packaging
can also play an important role in translational regulation.

Translation Alterations Encountered in Cancer 

Three main alterations at the translational level occur in cancer:
variations in mRNA sequences that increase or decrease
translational efficiency, changes in the expression or availability of
components of the translational machinery, and activation of
translation through aberrantly activated signal transduction
pathways. The first alteration affects the translation of an
individual mRNA that may play a role in carcinogenesis. The
second and third alterations can lead to more global changes, such
as an increase in the overall rate of protein synthesis, and the
translational activation of several mRNA species.

Variations in mRNA Sequence

Variations in mRNA sequence affect the translational
efficiency [...] and examples of each mechanism follow.

Mutations. Mutations in the mRNA sequence, especially
in the 5' UTR, can alter its translational efficiency, as seen in
the following examples.



c-myc. Saito et al. proposed that translation [...]
repressed, [...] deletions of the mRNA 5' UTR [...]
is more efficient. More [...] 5' UTR of c-myc contains an
IRES, and thus [...] translation can be initiated by a
cap-independent as well as a [...] myeloma, a C→T mutation
in the c-myc IRES was identified (48) and found to cause an
enhanced initiation [...] internal ribosomal entry (4[...]).

BRCA1. A somatic point mutation (117 [...]) in position
-3 with respect to the start codon of the BRCA1 [...] was
identified in a highly aggressive sporadic breast cancer (50).
Chimeric constructs consisting of the wild-type or mutated
5' UTR and a downstream luciferase reporter demonstrated
a decrease in the translational efficiency [...] UTR mutation.

Cyclin-dependent Kinase Inhibitor-2A. Some inherited
melanoma kindreds have a G→T transversion at base -34
of cyclin-dependent kinase inhibitor-2A, which encodes a
cyclin-dependent kinase 4/cyclin-dependent kinase 6 kinase
inhibitor important in checkpoint regulation (51). This
mutation gives rise to a novel AUG translation initiation
codon, creating an upstream open reading frame that
competes for scanning ribosomes and decreases translation
from the wild-type AUG.
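
The idea that a single base change in a 5' UTR can create a new upstream AUG can be illustrated with a short Python sketch. Everything here is a hypothetical toy: the 8-nucleotide sequence is made up for the example and is not the real cyclin-dependent kinase inhibitor-2A leader, and the helper names are assumptions. A G→T change in DNA appears as G→U in the mRNA.

```python
def upstream_augs(utr):
    """Return the 0-based positions of every AUG triplet in a 5' UTR.

    DNA-style input is accepted; T is read as U.
    """
    seq = utr.upper().replace("T", "U")
    return [i for i in range(len(seq) - 2) if seq[i:i + 3] == "AUG"]


def substitute(seq, pos, new_base):
    """Return a copy of seq with the base at 0-based pos replaced."""
    return seq[:pos] + new_base + seq[pos + 1:]


# Made-up leader sequence with no AUG (illustrative only).
wild_type = "GCCAGGCC"
# A G->U change at position 4 turns AGG into AUG.
mutant = substitute(wild_type, 4, "U")

print(upstream_augs(wild_type), upstream_augs(mutant))
```

In this toy case the wild-type leader has no upstream AUG, while the mutant gains one that a scanning ribosome would meet before the authentic start codon.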

Alternate Splicing and Alternate Transcription Start
Sites. Alterations in splicing and alternate transcription sites
can lead to variations in 5' UTR sequence, length, and
secondary structure, ultimately impacting translational efficiency.

ATM. The ATM gene has four noncoding exons in its 5'
UTR that undergo extensive alternative splicing (52). The
contents of 12 different 5' UTRs that show considerable
diversity in length and sequence have been identified. These
divergent 5' leader sequences play an important role in the
translational regulation of the ATM gene.
mdm2. In a subset of tumors, overexpression of the
oncoprotein [...] enhanced translation of the mdm2
mRNA. Use of different promoters leads to two mdm2
transcripts that differ only in their 5' leaders (53). The longer 5'
UTR contains two upstream open reading frames, and this
mRNA is loaded with ribosomes inefficiently compared with
the short 5' UTR.

BRCA1. In a normal mammary gland, BRCA1 mRNA is
expressed with a shorter leader sequence (5' UTRa), whereas
in sporadic breast cancer tissue, BRCA1 mRNA is expressed
with a longer leader sequence (5' UTRb); the translational
efficiency of transcripts containing 5' UTRb is 10 times lower
than that of transcripts containing 5' UTRa (54).

TGF-β3. TGF-β3 mRNA includes a [...]-kb 5' UTR, which
exerts an inhibitory effect on translation. Many human breast
[...] a 7-fold greater translational efficiency than the normal
TGF-β3 mRNA (55).

Alternate Polyadenylation Sites. Multiple polyadenylation
[...] leading to the generation of several transcripts
with differing 3' UTR have been described for several mRNA
[...] proto-oncogene (56), ATM gene (5[...]), tissue
inhibitor of metalloproteinases-3 (57), RHOA proto-oncogene
(58), and calmodulin-I (59). Although the effect of these
alternate 3' UTRs on translation is not yet known, they may
be important in RNA-protein interactions that affect
translational recruitment. The role of these alterations in
cancer development and progression is unknown.

Alterations in the Components of the
Translation Machinery

Alterations in the components of translation machinery can
take many forms.

Overexpression of eIF4E. Overexpression of eIF4E
causes malignant transformation in rodent cells (60) and the
deregulation of HeLa cell growth (61). Polunovsky et al. (62)
found that eIF4E overexpression substitutes [...] survival
signaling.

Elevated levels of eIF4E mRNA have been found in a broad
spectrum of transformed cell lines (63). eIF4E levels are
elevated in all ductal carcinoma in situ specimens and invasive
ductal carcinomas, compared with benign breast specimens
evaluated with Western blot analysis (64, 65). Preliminary
studies suggest that this overexpression is attributable
to gene amplification (66).

There are accumulating data suggesting that eIF4E
overexpression can be valuable as a prognostic marker. eIF4E
overexpression was found in a retrospective study to be a marker of
poor prognosis in stages I to III breast carcinoma (67).
Verification of the prognostic value of eIF4E in breast cancer is now
under way in a prospective trial (67). However, in a different
study, eIF4E expression was correlated with the aggressive
behavior of non-Hodgkin's lymphomas (68). In a prospective
analysis of patients with head and neck cancer, elevated levels
of eIF4E in histologically tumor-free surgical margins predicted
a significantly increased risk of local-regional recurrence [...].
These results all suggest that eIF4E overexpression can be
used to select patients who might benefit from more aggressive
systemic therapy. Furthermore, the head and neck cancer
[...] suggest that eIF4E overexpression is a field defect and can
be used to guide local therapy.

Alterations in Other Initiation Factors. Alterations in a
number of other initiation factors have been associated with
cancer. Overproduction of eIF4G, similar to eIF4E, leads to
malignant transformation in vitro (69). eIF-2α is found in
increased levels in bronchioloalveolar carcinomas of the lung
[...]. Initiation factor eIF-4A1 is overexpressed in melanoma
(70) and hepatocellular carcinoma (71). The p40 subunit of
translation initiation factor 3 is amplified and overexpressed
in breast and prostate cancer (72), and the eIF3-p110 subunit
is overexpressed in testicular seminoma (73). The role that
overexpression of these initiation factors plays in the
development and progression of cancer, if any, is not known.

Overexpression of S6K. S6K is amplified and highly
overexpressed in the MCF7 breast cancer cell line, compared
with normal mammary epithelium (74). In a study by
Barlund et al. (74), S6K was amplified in 59 of 668 primary
breast tumors, and a statistically significant association was
observed between amplification and poor prognosis.



[...] PAP is overexpressed in [...] compared with normal
and virally transformed cells (7[...]). PAP [...]. Little is
known, however, about how PAP [...] affects the
translational profile.
Alterations in RNA Packaging. Even less is known
about alterations in RNA packaging in cancer. Increased
expression and nuclear localization of the RNA-binding protein
YB-1 are indicators of a poor prognosis for breast cancer (77),
non-small cell lung cancer (78), and ovarian cancer [...].
However, this effect may be mediated at least in part at the level of
transcription, because YB-1 increases chemoresistance by
enhancing the transcription of a multidrug resistance gene [...].

Activation of Signal Transduction Pathways
Activation of signal transduction pathways by loss of tumor
suppressor genes or overexpression of certain tyrosine kinases
can contribute to the growth and aggressiveness of tumors. An
important mutant in human cancers is the tumor suppressor
gene PTEN, which leads to the activation of the PI3K/Akt
pathway. Activation of PI3K and Akt induces the oncogenic
transformation [...] show constitutive phosphorylation of S6K
and of 4E-BP1 (81). A mutant Akt that retains kinase activity
but does not phosphorylate S6K or 4E-BP1 does not transform
[...], which suggests a correlation between the oncogenicity of
PI3K and Akt and the phosphorylation of S6K and 4E-BP1 (81).
Several tyrosine kinases such as platelet-derived growth
factor, insulin-like growth factor, HER2/neu, and epidermal
growth factor receptor are overexpressed in cancer. Because
these kinases activate downstream signal transduction
pathways known to alter translation initiation, activation
of translation is likely to contribute to the growth and
aggressiveness of these tumors. Furthermore, the mRNA for many
of these kinases themselves are under translational control.
For example, HER2/neu mRNA is translationally controlled
both by a short upstream open reading frame that represses
HER2/neu translation in a cell type-independent manner and
by a distinct cell type-dependent mechanism that increases
translational efficiency (82). HER2/neu translation is different
in transformed and normal cells. Thus, it is possible that
alterations at the translational level can in part account for
[...] HER2/neu gene amplification detected by fluorescence in
situ hybridization and protein levels detected by
immunohistochemical assays.

Translation Targets of Selected Cancer Therapy
Components of the translation machinery and signal pathways
involved in the activation of translation initiation represent
good targets for cancer therapy.

[...] proliferation of lymphocytes. It was initially developed
as an immunosuppressive drug for organ transplantation.
Rapamycin [...] FKBP 12 (FK506-binding protein, Mr 12,000)
binds to mTOR to inhibit its [...].

Rapamycin causes a small but significant reduction [...]
initiation rate of protein synthesis (83). [...] in part by
blocking S6 phosphorylation and selectively [...] the
translation of 5' TOP mRNAs, such as ribosomal proteins, and
elongation factors (83-8[...]). Rapamycin also blocks 4E-BP1
phosphorylation and inhibits cap-dependent but not
cap-independent translation (17, 8[...]).

The rapamycin-sensitive signal transduction [...]
[...] breast, small cell lung, [...], and [...] leukemia are
among the cancer lines most sensitive to the rapamycin
analogue CCI-779 (Wyeth-Ayerst Research; Ref. [...]). In
rhabdomyosarcoma cell lines, rapamycin is [...] static or
cytocidal, depending on the p53 status of the cell; p53
wild-type cells treated with rapamycin arrest in the G1 phase
and maintain their viability, whereas p53 mutant cells
accumulate in G1 and undergo apoptosis (88, 8[...]). In a
recently reported study using human primitive neuroectodermal
tumor and medulloblastoma models, rapamycin exhibited [...]
cytotoxicity in combination with cisplatin and camptothecin
[...] single agent. In vivo, CCI-779 delayed growth of
xeno[...] 160% after 1 week of therapy and 240% after 2
weeks. A single high-dose administration caused a 37%
decrease in tumor volume. Growth [...] cisplatin in
combination with CCI-779 than with [...] ([...]). Thus,
preclinical studies suggest that rapamycin analogues are useful
as single agents and in combination with chemotherapy.

Rapamycin analogues CCI-779 and RAD001 (Novartis,
Basel, Switzerland) are now in clinical trials. Because of the
known effect of rapamycin on lymphocyte proliferation, a
potential problem with rapamycin analogues is
immunosuppression. However, although prolonged immunosuppression
can result from rapamycin and CCI-779 administered on
continuous-dose schedules, the immunosuppressive effects
of rapamycin analogues resolve in ~24 h after therapy
(91). The principal toxicities of CCI-779 have included
dermatological toxicity, myelosuppression, infection, mucositis,
diarrhea, reversible elevations in liver function tests,
hyperglycemia, hypokalemia, hypocalcemia, and depression (87,
92-[...]). Phase II trials of CCI-779 have been conducted in
advanced renal cell carcinoma and in stage III/IV breast
carcinoma patients who failed with prior chemotherapy. In
the results reported in abstract form, although there were no
complete responses, partial responses were documented in
both renal cell carcinoma and in breast carcinoma (94, 95).
Thus, CCI-779 has documented preliminary clinical activity in
a previously treated, unselected patient population.

Active investigation is under way into patient selection for
mTOR inhibitors. Several studies have found an enhanced
efficacy of CCI-779 in PTEN-null tumors (30, 96). Another
study found that six of eight breast cancer cell lines were
responsive to CCI-779, although only two of these lines
lacked PTEN (97). There was, however, a positive correlation
between Akt activation and CCI-779 sensitivity (97). This
correlation suggests that activation of the PI3K-Akt pathway,






regardless of whether it is attributable to a PTEN mutation
[...] rapamycin resistance [...]
may predict rapamycin resistance (9[...]).

[...] inhibition of angiogenesis. [...]
through direct inhibition [...]
proangiogenic factors [...]
vascular endothelial growth factor in tumor cells (100).

The angiogenesis inhibitor tumstatin, another anticancer
drug currently under study, [...]
[...] endothelial cells and pre-
vents dissociation of eIF4E from 4E-BP1, thereby inhibiting
translation. These findings suggest that endothelial
cells are especially sensitive to therapies targeting
the mTOR-signaling pathway.

[...] incidence of cancer
(1[...]). EPA inhibits the proliferation of cancer cells (1[...])
[...], thereby ac-
tivating PKR. PKR, in turn, phosphorylates and inhibits eIF2α,
resulting in the inhibition of protein synthesis at the level of
translation initiation. Similarly, clotrimazole, a potent [...]
[...] growth through [...]
of eIF2α [...]. Consequently, clotrimazole preferen-
tially decreases the expression of cyclins A, E, and D1,
resulting in blockage of the cell cycle in [...].

[...] developed
as a gene therapy agent. Adenoviral transfer of mda-7 (Ad-
mda7) induces apoptosis in many cancer cells, including
breast, colorectal, and lung cancer (107-109). Ad-mda7 also
[...] PKR, which leads to phosphorylation
of eIF2α and induction of apoptosis (110).

Flavonoids such as genistein and quercetin suppress tu-
mor cell growth. All three mammalian eIF2α kinases, PKR,
[...] PERK/PEK, are activated by
[...]

[...] antisense RNA
[...] proliferation rate
[...] tumorigenic [...]
[...] RNA decreases
[...] agar growth, increases tumor latency, and increases the
rates of tumor doubling times (7). Antisense eIF4E [...]
also reduces [...]
and has been proposed as a potential adjuvant therapy [...]
cancers, particularly when [...]

[...] Selective Translation [...]

A different therapeutic approach that takes advantage of the
enhanced cap-dependent translation [...]
[...], they would translate more efficiently. [...]
coding sequence of herpes simplex virus [...]
gene allows for selective translation of herpes
[...] gene in breast cancer cell lines
compared with normal mammary cell lines and results in se-
lective sensitivity to ganciclovir (117).

Toward the Future

[...] alterations in translational control occur in cancer. Cancer
cells appear to need an aberrantly activated translational state
for survival, thus allowing the targeting of translation [...]
surprisingly low toxicity. Components of the translational ma-
chinery such as eIF4E and signal transduction pathways in-
volved in translation initiation [...]
targets for cancer therapy. Inhibitors of the mTOR have already
shown some preliminary activity in clinical [...]
better patient selection, response rates to single-agent therapy
[...]
mTOR inhibitors are most likely to achieve clinical utility in
combination therapy. [...]
promises to lead to the identification of new therapeutic [...]
in the near future.
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